Modeling bursts and heavy tails in human dynamics

The dynamics of many social, technological and economic phenomena are driven by individual human
actions, turning the quantitative understanding of human behavior into a central question of modern science.
Current models of human dynamics, used from risk assessment to communications, assume that human actions 
are randomly distributed in time and thus well approximated by Poisson processes. Here we provide direct 
evidence that for five human activity patterns, such as email and letter based communications, web browsing, 
library visits and stock trading, the timing of individual human actions follows non-Poisson statistics,
characterized by bursts of rapidly occurring events separated by long periods of inactivity. We show that the bursty
nature of human behavior is a consequence of a decision based queuing process: when individuals execute tasks 
based on some perceived priority, the timing of the tasks will be heavy tailed, most tasks being rapidly executed, 
while a few experiencing very long waiting times. In contrast, priority blind execution is well approximated 
by uniform interevent statistics. We discuss two queueing models that capture human activity. The first model 
assumes that there are no limitations on the number of tasks an individual can handle at any time, predicting
that the waiting time of the individual tasks follows a heavy tailed distribution $P(\tau_w) \sim \tau_w^{-\alpha}$ with $\alpha = 3/2$.
The second model imposes limitations on the queue length, resulting in a heavy tailed waiting time distribution 
characterized by $\alpha = 1$. We provide empirical evidence supporting the relevance of these two models to human
activity patterns, showing that while emails, web browsing and library visitation display $\alpha = 1$, the surface mail
based communication belongs to the $\alpha = 3/2$ universality class. Finally, we discuss possible extensions of the
proposed queueing models and outline some future challenges in exploring the statistical mechanisms of human 
dynamics. These findings have important implications not only for our quantitative understanding of human 
activity patterns, but also for resource management and service allocation in both communications and retail. 



I. INTRODUCTION 

Humans participate on a daily basis in a large number of 
distinct activities, from electronic communication, such as 
sending emails or browsing the web, to initiating financial 
transactions or engaging in entertainment and sports. Given 
the number of factors that determine the timing of each action, 
ranging from work and sleep patterns to resource availability, 
it appears impossible to seek regularities in the apparently ran- 
dom human activity patterns, apart from the obvious daily and 
seasonal periodicities. Therefore, in contrast with the accurate 
predictive tools common in physical sciences, forecasting hu- 
man and social patterns remains a difficult and often elusive 
goal. Yet, the need to understand the timing of human ac- 
tions is increasingly important. Indeed, uncovering the laws 
governing human dynamics in a quantitative manner is of ma- 
jor scientific interest, requiring us to address the factors that 
determine the timing of human actions. But these questions 
are driven by applications as well: most human actions have 
a strong impact on resource allocation, from phone line avail- 
ability and bandwidth allocation in the case of Internet or Web 
use, all the way to the design of physical space for retail or 
service oriented institutions. Despite these fundamental and 
practical driving forces, our understanding of the timing of 
human initiated actions is rather limited at present. 

To be sure, the interest in addressing the timing of events in 
human dynamics is not new: it has a long history in the math- 
ematical literature, leading to the development of some of the 
key concepts in probability theory [1], and reemerged at
the beginning of the 20th century as the design problems
surrounding the phone system required a quantitative
understanding of the call patterns of individuals. But most current
models of human activity assume that human actions are performed
at a constant rate, meaning that a user has a fixed probability
to engage in a specific action within a given time interval.
These models approximate the timing of human actions with a
Poisson process, in which the time interval between two
consecutive actions by the same individual, called the waiting or
inter-event time, follows an exponential distribution [2].
Poisson processes are at the heart of the celebrated Erlang formula
predicting the number of phone lines required in an
institution, and they represent the basic approximation in the
design of most currently used Internet protocols and routers
[4]. Yet, the availability of large datasets recording selected
human activity patterns increasingly questions the validity of
the Poisson approximation. Indeed, an increasing number of
recent measurements indicate that the timing of many human
actions systematically deviates from the Poisson prediction, the
waiting or inter-event times being better approximated by a
heavy tailed or Pareto distribution. The difference
between a Poisson and a heavy tailed behavior is striking: the
exponential decay of a Poisson distribution forces
consecutive events to follow each other at relatively regular time
intervals and forbids very long waiting times. In contrast, the
slowly decaying heavy tailed processes allow for very long
periods of inactivity that separate bursts of intensive activity.

We have recently proposed that the bursty nature of human
dynamics is a consequence of a queuing process driven by
human decision making: whenever an individual is presented
with multiple tasks and chooses among them based on some
perceived priority parameter, the waiting time of the various
tasks will be Pareto distributed. In contrast, first-come-first-serve
and random task execution, common in most service
oriented or computer driven environments, lead to uniform,
Poisson-like dynamics. Yet, this work has generated just as
many questions as it resolved. What are the different classes
of processes that are relevant for human dynamics? What
determines the scaling exponents? Do we have discrete
universality classes (and if so how many) as in critical
phenomena [9], or can the exponents characterizing the heavy tails
take up arbitrary values, as is the case in network theory
[10, 11, 12]? Is human dynamics always heavy tailed?

In this paper we aim to address some of these questions
by studying the different universality classes that can appear
as a result of the queuing of human activities. We first
review, in Sect. II, the frequently used Poisson approximation,
which predicts an exponential distribution of interevent times.
In Sect. III we present evidence that the interevent time
probability density function (pdf) $P(\tau)$ of many human activities
is characterized by the power law tail

$$P(\tau) \sim \tau^{-\alpha} . \tag{1}$$

In Sect. IV we discuss the general characteristics of the
queueing models that govern how humans time their various
activities. In Sects. V and VI we study two classes of queuing models
designed to capture human activity patterns. We find that
restrictions on the queue length play an important role in
determining the scaling of the queuing process, allowing us to
document the existence of two distinct universality classes, one
characterized by $\alpha = 3/2$ (Sect. V) and the other by $\alpha = 1$
(Sect. VI). In Sect. VII we discuss the relationship between
interevent and waiting times. Finally, in Sect. VIII we
discuss the applicability of these models to explain the empirical
data, as well as outline future challenges in modeling human
dynamics.

FIG. 1: The difference between the activity patterns predicted by a
Poisson process (top) and the heavy tailed distributions observed in
human dynamics (bottom). (a) Succession of events predicted by a
Poisson process, which assumes that in any moment events take place
with probability $q$. The horizontal axis denotes time, each vertical
line corresponding to an individual event. Note that the interevent
times are comparable to each other, long delays being virtually
absent. (b) The absence of long delays is visible on the plot
showing the delay times $\tau$ for 1,000 consecutive events, the size of each
vertical line corresponding to the gaps seen in (a). (c) The
probability of finding exactly $n$ events within a fixed time interval is
$P(n; q) = e^{-qt}(qt)^n/n!$, which predicts that for a Poisson process
the inter-event time distribution follows $P(\tau) = q e^{-q\tau}$, shown on a
log-linear plot in (c) for the events displayed in (a, b). (d) The
succession of events for a heavy tailed distribution. (e) The waiting time
$\tau$ of 1,000 consecutive events, where the mean event time was chosen
to coincide with the mean event time of the Poisson process shown
in (a-c). Note the large spikes in the plot, corresponding to very long
delay times. (b) and (e) have the same vertical scale, allowing one to
compare the regularity of a Poisson process with the bursty nature of
the heavy tailed process. (f) Delay time distribution $P(\tau) \sim \tau^{-2}$
for the heavy tailed process shown in (d, e), appearing as a straight
line with slope -2 on a log-log plot. The signal shown in (d-f) was
generated using $\gamma = 1$ in the stochastic priority list model discussed
in Appendix A.



II. POISSON PROCESSES 

Consider an activity performed with some regularity, such 
as sending emails, placing phone calls, visiting a library, or 
browsing the web. We can keep track of this activity by 
recording the timing of each event, for example the time each 
email is sent by an individual. We call the time between two
consecutive events the interevent time for the monitored
activity and denote it by $\tau$. Given that the interevent time
can be explicitly measured for selected activities, it serves as a 
test of our ability to understand and model human dynamics: 
proper models should be able to capture its statistical proper- 
ties. 

The most primitive model of human activity would assume
that human actions are fundamentally periodic, with a period
determined by the daily sleep patterns. Yet, while some
periodicity is certainly present, the timing of most human actions
is highly stochastic. Indeed, periodic models are hopeless in
capturing the time we check out a book from the library,
beyond telling us that it should be within the library's operating
hours. The first and still most widely used stochastic model of
human activity assumes that the tasks are executed
independently from each other at a constant rate $\lambda$, so that the time
resolved activity of an individual is well approximated by a
Poisson process. In this case the probability density
function (pdf) of the recorded interevent times has the exponential
form

$$P(\tau) = \lambda e^{-\lambda\tau} . \tag{2}$$

In practice this means that the predicted activity pattern, while
stochastic, will display some regularity in time, events
following each other on average at $\tau \approx \langle\tau\rangle = 1/\lambda$ intervals. Indeed,
given that for a Poisson process $\sigma = \sqrt{\langle\tau^2\rangle - \langle\tau\rangle^2} = \langle\tau\rangle$ is
finite, very long waiting times (i.e. large temporal gaps in the
sequence of events) are exponentially rare. This is illustrated
in Fig. 1, where we show a sequence of events generated by a
Poisson process, appearing uniformly distributed in time (but
not periodic).
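
The regularity implied by Eq. (2) can be checked numerically. The sketch below is only an illustration, not part of the original analysis: the rate value, the sample size and the function name `poisson_interevent_times` are assumptions chosen for the example. It draws exponential interevent times and compares their mean and standard deviation with the inverse of the rate.

```python
import random
import statistics

def poisson_interevent_times(rate, n_events, seed=0):
    """Draw n_events interevent times for a Poisson process with the given rate."""
    rng = random.Random(seed)
    # For a Poisson process the gaps between events are exponentially distributed, Eq. (2).
    return [rng.expovariate(rate) for _ in range(n_events)]

rate = 2.0  # illustrative value of the constant rate, not taken from the paper
taus = poisson_interevent_times(rate, 100_000)

mean_tau = statistics.fmean(taus)
sigma_tau = statistics.pstdev(taus)
print(f"mean = {mean_tau:.3f}, sigma = {sigma_tau:.3f}, 1/rate = {1 / rate:.3f}")
print(f"longest gap / mean gap = {max(taus) / mean_tau:.1f}")
```

Both the mean and the standard deviation should come out close to the inverse rate, and even the longest of the sampled gaps should stay within a modest multiple of the mean, which is the sense in which long delays are exponentially rare.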

The Poisson process was originally introduced by Poisson
in his major work applying probability concepts to the
administration of justice [13]. Today it is widely used to quantify the
consequences of human actions, such as modeling traffic flow
patterns or accident frequencies [2], and is commercially used
in call center staffing [14], inventory control [15], or to
estimate the number of congestion-caused blocked calls in mobile
communications [4]. It was established as a basic model
of human activity patterns at a time when data collection
capabilities on human behavior were rather limited. In the past
few years, however, thanks to detailed computer based data
collection methods, there is increasing evidence that the
Poisson approximation fails to capture the timing of many human
actions.



III. EMPIRICAL RESULTS

Evidence that non-Poisson activity patterns characterize
human activity first emerged in computer
communications, where the timing of many human driven events is
automatically recorded. For example, measurements capturing
the distribution of the time differences between consecutive
instant messages sent by individuals during online chats [16]
have found evidence of heavy tailed statistics. Professional
tasks, such as the timing of job submissions on a
supercomputer [17], directory listings and file transfers (FTP requests)
initiated by individual users [18] were also reported to display
non-Poisson features. Similar patterns emerge in economic
transactions [19, 20], in the number of hourly trades in a given
security [21] or the time interval distribution between
individual trades in currency futures [22]. Finally, heavy tailed
distributions characterize entertainment related events, such as the
time intervals between consecutive online games played by
users [23]. Note, however, that while these datasets provide
clear evidence for non-Poisson human activity patterns, most
of them do not resolve individual human behavior, but capture
only the aggregated behavior of a large number of users. For
example, the dataset recording the timing of the job
submissions looks at the timing of all jobs submitted to a computer,
by any user. Thus for these measurements the interevent time
does not characterize a single user but rather a population of
users. Given the extensive evidence that the activity
distribution of the individuals in a population is heavy tailed, these
measurements have difficulty capturing the origin of the
observed heavy tailed patterns. For example, while most people
send only a few emails per day, a few send a very large number
on a daily basis [24, 25].

If the activity pattern of a large number of users is simul- 
taneously captured, it is not clear where the observed heavy 
tails come from: are they rooted in the activity of a single 
individual, or rather in the heavy tailed distribution of user ac- 
tivities? Therefore, when it comes to our quest to understand 
human dynamics, datasets that capture the long term activity 
pattern of a single individual are of particular value. To our

FIG. 2: The interevent time distribution between (a) two
consecutive visits of a web portal by a single user; (b) two consecutive library
loans made by a single individual; (c) two consecutive emails sent
out by a user. For (a-c) we show as a straight line the $\alpha = 1$ scaling.
(d) The interevent time distribution between two consecutive
transactions made by a stock broker. The distribution follows a power law
with an exponential cutoff, $P(\tau) \sim \tau^{-1.3} \exp(-\tau/\tau_0)$. (e-g) The
distribution of the exponents ($\alpha$) characterizing the interevent time
distribution of users browsing the web portal (e), individual loans
from the library (f) and the emails sent by different individuals (g).
The exponent $\alpha$ was determined only for users whose total
activity levels exceeded certain thresholds, the values used being 15 web
visits (e), 10 books (f) and 15 emails (g). (h, l) We numerically
generate for 10,000 individuals interevent time distributions following a
power law with exponent $\alpha = 1$. The distribution of the measured
exponents follows a normal distribution similar to the distribution
observed in (e-g). If we double the time window of the simulation (h)
the deviation around the average becomes much smaller (l). (i-k) The
distribution of the number of events in the studied systems: the number
of HTML hits for each user (i), the number of books checked out by
each user (j) and the number of emails sent by different individuals
(k), indicating that the overall activity of individuals is also
heavy tailed.



best knowledge only three papers have taken this approach,
capturing the timing of printing jobs submitted by users [26],
the email activity patterns of individual email users [8, 24]
and the browsing pattern of users visiting a major web
portal [7]. These measurements offer direct evidence that the
heavy tailed activity patterns emerge at the level of a single
individual, and are not a consequence of the heterogeneous
distribution of user activity. Despite this evidence, a
number of questions remain unresolved: Is there a single scaling
exponent characterizing all users, or does each user have their
own exponent? What is the range of these exponents? Next
we aim to address these questions through the study of six
datasets, each capturing individual human activity patterns of
different nature. First we describe the datasets and the
collection methods, followed by a quantitative characterization of
the observed human activity patterns.

Web browsing: Automatically assigned cookies allow us
to reconstruct the browsing history of approximately 250,000
unique visitors of the largest Hungarian news and
entertainment portal (origo.hu), which provides online news and
magazines, community pages, software downloads, free email and
a search engine, capturing 40% of all internal Web traffic in
Hungary [7, 27]. The portal receives 6,500,000 HTML hits
on a typical workday. We used the log files of the portal to
collect the visitation pattern of each visitor between 11/08/02
and 12/08/02, recording with second resolution the timing of
each download by each visitor [7]. The interevent time, $\tau$, was
defined as the time interval between consecutive page
downloads (clicks) by the same visitor.

Email activity patterns: This dataset contains the email
exchange between individuals in a university environment,
capturing the sender, recipient and the time of each email sent
during a three and six month period by 3,188 [24] and 9,665
[25] users, respectively. We focused here on the data collected
by Eckmann [24], which records 129,135 emails with second
resolution. The interevent time corresponds to the time
between two consecutive emails sent by the same user.

Library loans: The data contains the time, with second
resolution, at which books or periodicals were checked out from
the library by the faculty at the University of Notre Dame
during a three year period. The number of unique individuals in
the dataset is 2,247, together participating in a total of 48,409 
transactions. The interevent time corresponds to the time dif- 
ference between consecutive books or periodicals checked out 
by the same patron. 

Trade transactions: A dataset recording all transactions
(buy/sell) initiated by a stock broker at a Central European
bank between 06/99 and 5/03 helps us quantify the
professional activity of a single individual, giving a glimpse of the
human activity patterns driving economic phenomena. In a
typical day the transactions start at 7AM and end at 7PM,
and the average number of transactions initiated by the dealer
in one day is around 10, resulting in a total of 54,374
transactions. The interevent time represents the time between two
consecutive transactions by the broker. The gap between the
last transaction at the end of one day and the first transaction
at the beginning of the next trading day was ignored.
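
The rule for discarding overnight gaps can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `intraday_interevent_times` and the sample timestamps are hypothetical, and "same trading day" is approximated here by "same calendar day", which also drops weekend gaps.

```python
from datetime import datetime

def intraday_interevent_times(timestamps):
    """Return the gaps, in seconds, between consecutive events on the same calendar day."""
    ordered = sorted(timestamps)
    gaps = []
    for earlier, later in zip(ordered, ordered[1:]):
        # Skip the gap between the last event of one day and the first of the next.
        if earlier.date() == later.date():
            gaps.append((later - earlier).total_seconds())
    return gaps

# Hypothetical transaction times, for illustration only.
trades = [
    datetime(2001, 3, 5, 9, 0, 0),
    datetime(2001, 3, 5, 9, 0, 40),
    datetime(2001, 3, 5, 15, 30, 0),
    datetime(2001, 3, 6, 8, 15, 0),
    datetime(2001, 3, 6, 8, 20, 0),
]
print(intraday_interevent_times(trades))
```

In this toy example the overnight gap between the third and fourth transaction is dropped, leaving three intraday interevent times.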

The correspondence patterns of Einstein, Darwin and
Freud: We start from a record containing the sender, recipient
and the date of each letter [28, 29, 30] sent or received by
the three scientists during their lifetime. The databases used
in our study were provided by the Darwin Correspondence
Project (http://www.lib.cam.ac.uk/Departments/Darwin/), the
Einstein Papers Project (http://www.einstein.caltech.edu/)
and the Freud Museum of London (http://www.freud.org.uk).
Each dataset contains the information about each
sent/received letter in the following format: SENDER,
RECIPIENT, DATE, where either the sender or the recipient
is Einstein, Darwin or Freud. The Darwin dataset contained
a record of a total of 7,591 letters sent and 6,530 letters
received by Darwin (a total of 14,121 letters). Similarly, the
Einstein database contained 14,512 letters sent and 16,289
letters received (a total of 30,801). For Freud we have 3,183
sent and 2,675 received letters. Note that 1,541 letters in the
Darwin database and 1,861 letters in the Einstein database
were not dated or were assigned only potential time intervals
spanning days or months. We discarded these letters from
the dataset. Furthermore, the dataset is naturally incomplete,
as not all letters written or received by these scientists were
preserved. Yet, assuming that letters are lost at a uniform
rate, the missing letters should not affect our main findings. For these
three datasets we do not focus on the interevent times,
but rather on the response or waiting times $\tau_w$. The waiting
time, $\tau_w$, represents the time interval between the date of
a letter received from a given person, and the date of the
next letter from Darwin, Einstein or Freud to him or her, i.e.
the time the letter waited on their desk before a response
was sent. To analyze the response times of Einstein, Darwin
and Freud we followed this procedure: if
individual A sent a letter to Einstein on DATE1, we search
for the next letter from Einstein to individual A, sent on
DATE2, the response time representing the time difference
$\tau_w$ = DATE2 - DATE1, expressed in days. If there are
multiple letters from Einstein to the recipient, we always
consider the first letter as the response, and discard the later
ones. Missing letters could increase the response time, the
magnitude of this effect depending on the overall frequency
of communication between the respective correspondence
partners. Yet, if the response time follows a distribution
with an exponential tail, then randomly distributed missing
letters would not generate a power law waiting time: they
would only shift the exponential waiting times to longer
average values. Thus the observed power law cannot be
attributed to data incompleteness.
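
The matching procedure described above can be sketched in a few lines. This is a hedged illustration, not the code used for the paper: the function name `response_times`, the toy records and the choice to accept only replies dated strictly later than the received letter are assumptions made for the example.

```python
from bisect import bisect_right
from collections import defaultdict
from datetime import date

def response_times(records, scientist):
    """Return response times in days for letters received by `scientist`.

    `records` is an iterable of (sender, recipient, date) tuples.
    """
    sent = defaultdict(list)      # correspondent -> dates of letters sent to them
    received = defaultdict(list)  # correspondent -> dates of letters received from them
    for sender, recipient, day in records:
        if sender == scientist:
            sent[recipient].append(day)
        elif recipient == scientist:
            received[sender].append(day)

    waits = []
    for person, in_dates in received.items():
        out_dates = sorted(sent.get(person, []))
        for day in sorted(in_dates):
            # Assumption: the response is the first letter dated strictly after `day`.
            i = bisect_right(out_dates, day)
            if i < len(out_dates):
                waits.append((out_dates[i] - day).days)
    return waits

# Hypothetical records, for illustration only.
records = [
    ("A", "Einstein", date(1920, 1, 1)),
    ("Einstein", "A", date(1920, 1, 11)),
    ("Einstein", "A", date(1920, 2, 1)),  # later letter, not counted as the response
    ("B", "Einstein", date(1920, 3, 1)),  # never answered, contributes no response time
    ("A", "Einstein", date(1921, 1, 1)),
    ("Einstein", "A", date(1921, 1, 4)),
]
print(sorted(response_times(records, "Einstein")))
```

Letters that never receive a reply simply contribute no response time, which mirrors the way missing letters can lengthen but not shorten the measured waits.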

In the following we break our discussion into three
subsections, each focusing on a specific class of behavior
observed in the studied individual activity patterns.



A. The $\alpha = 1$ universality class: Web browsing, email, and
library datasets

In Fig. 2a-c we show the interevent time distribution
between consecutive events for a single individual for the first
three studied databases: Web browsing, email, and library
visitation. For these datasets we find that the interevent time
distribution has a power-law tail

$$P(\tau) \sim \tau^{-\alpha} \tag{3}$$

with exponent $\alpha \approx 1$, independent of the nature of the
activity. Given that for these activity patterns we collected data
for thousands of users, we need to calculate the distribution
of the exponent $\alpha$ determined separately for each user whose
activity level exceeds a certain threshold (i.e. avoiding users
that have too few events to allow a meaningful determination
of $P(\tau)$). As Fig. 2e-g shows, we find that the distribution of
the exponents is peaked around $\alpha \approx 1$.

The scattering around $\alpha = 1$ in the measured exponents
could have two different origins. First, it is possible that each
user is characterized by a different scaling exponent $\alpha$.
Second, each user could have the same exponent $\alpha = 1$, but given
the fact that the available dataset captures only a finite time
interval from one month to several months, with at best a few
thousand events in this interval, there are uncertainties in our
ability to determine numerically the exponent $\alpha$. To
demonstrate that such data incompleteness could indeed explain the
observed scattering, in Figs. 2h and 2l we show the result
of a numerical experiment, in which we generated 10,000
time series, corresponding to 10,000 independent users, the
interevent time of the events for each user being taken from
the same distribution $P(\tau) \sim \tau^{-1}$. The total length in time of
each time series was chosen to be 1,000,000. We then used
the automatic fitting algorithm employed earlier to measure
the exponents in Figs. 2e-g to determine numerically the
exponent $\alpha$ for each user. In principle for each user we should
observe the same exponent $\alpha = 1$, given that the datasets were
generated in an identical fashion. In practice, however, due to
the finite length of the data, each numerically determined
exponent is slightly different, resulting in the histogram shown
in Fig. 2h. As the figure shows, even in this well controlled
situation we observe a scattering in the measured exponents,
obtaining a distribution similar to the one seen in Figs. 2e-g.
The longer the time series, the sharper the distribution is
(Fig. 2l), given that the exponent $\alpha$ can be determined more
accurately.
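
A scaled-down version of this numerical experiment can be sketched as follows. It is not the fitting algorithm used for the figures, which is not specified here: the cutoffs `tau_min` and `tau_max`, the number of users, the window lengths and the log-binned least-squares fit are all assumptions chosen to keep the example small and self-contained.

```python
import math
import random
import statistics

def sample_series(total_time, tau_min, tau_max, rng):
    """Interevent times drawn from P(tau) ~ 1/tau on [tau_min, tau_max].

    Events are added until the next one would exceed total_time.
    """
    log_ratio = math.log(tau_max / tau_min)
    taus, elapsed = [], 0.0
    while True:
        # Inverse-transform sampling: log(tau) is uniform when P(tau) ~ 1/tau.
        tau = tau_min * math.exp(rng.random() * log_ratio)
        if elapsed + tau > total_time:
            return taus
        taus.append(tau)
        elapsed += tau

def fit_exponent(taus, tau_min, tau_max, n_bins=10):
    """Estimate the exponent from the slope of the log-binned density on a log-log scale."""
    log_ratio = math.log(tau_max / tau_min)
    counts = [0] * n_bins
    for tau in taus:
        k = min(int(n_bins * math.log(tau / tau_min) / log_ratio), n_bins - 1)
        counts[k] += 1
    xs, ys = [], []
    for k, count in enumerate(counts):
        if count == 0:
            continue  # empty bins cannot be placed on a log scale
        lo = tau_min * math.exp(log_ratio * k / n_bins)
        hi = tau_min * math.exp(log_ratio * (k + 1) / n_bins)
        xs.append(0.5 * (math.log(lo) + math.log(hi)))  # log of the geometric bin center
        ys.append(math.log(count / (hi - lo)))          # log of the density estimate
    if len(xs) < 2:
        raise ValueError("need at least two non-empty bins to fit a slope")
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return -slope

def exponent_scatter(n_users, total_time, tau_min=1.0, tau_max=1e3, seed=0):
    """Fit the exponent for n_users independent series; return (mean, standard deviation)."""
    rng = random.Random(seed)
    alphas = [
        fit_exponent(sample_series(total_time, tau_min, tau_max, rng), tau_min, tau_max)
        for _ in range(n_users)
    ]
    return statistics.fmean(alphas), statistics.stdev(alphas)

mean_short, sd_short = exponent_scatter(n_users=200, total_time=5e4)
mean_long, sd_long = exponent_scatter(n_users=200, total_time=2e5)
print(f"window 5e4: exponent = {mean_short:.3f} +/- {sd_short:.3f}")
print(f"window 2e5: exponent = {mean_long:.3f} +/- {sd_long:.3f}")
```

Although every synthetic user shares the same underlying exponent, the fitted values scatter around 1, and the scatter should narrow when the observation window is lengthened (here by a factor of four rather than two, to make the effect easier to see with few users).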

The distributions obtained for the three studied datasets are
not as well controlled as the one used in our simulation: while
the length of the observation period is the same for each user,
the activity level of the users differs widely. Indeed, as we
show in Fig. 2i-k, the activity distribution of the different
users, representing the number of events recorded for each
user, also spans several orders of magnitude, following a fat
tailed distribution. Thus the degree of scattering of the
measured exponent $\alpha$ is expected to be more significant than seen
in Figs. 2h and 2l, since we can determine the exponent
accurately only for very active users, for whom we have a
significant number of datapoints. Therefore, the obtained results
are consistent with the hypothesis that each user is
characterized by a scaling exponent in the vicinity of $\alpha = 1$, the
difference in the numerically measured exponent values being
likely rooted in the finite number of events we record for each
user in the datasets. This conclusion will be eventually
corroborated by our modeling efforts, which indicate that the
exponents characterizing human behavior take up discrete values,
one of which provides the empirically observed $\alpha = 1$.

As we will see in the following sections, an important
measure of the human activity patterns is the waiting time, $\tau_w$,
representing the amount of time a task waits on an
individual's priority list before being executed. For the email dataset,
given that we know when a user receives an email from
another user and the time she sends the next email back to that user, we
can determine the email's waiting or response time.
Therefore, we define the waiting time as the difference between the
time user A receives an email from user B, and the time A
sends an email to user B. In looking at this quantity we should

FIG. 3: Distribution of the response and arrival time intervals of the
email user shown in Fig. 2c. (a) Given two email users A and B, the
response times of user A to B are the time intervals between A
receiving an email from B and A sending an email to B. The response
time distribution of user A is then computed taking into account the
response times to all users he/she communicates with. The
continuous line is a power law fit with exponent $\alpha = 1.0$. (b) Given a user
A, the inter-arrival times are the time intervals between two
consecutive arrivals of an email to user A, independently of the sender.
The arrival time distribution of user A is obtained taking into account
all the inter-arrival times for that user. The continuous line is a power
law fit with exponent 0.98. (c) The real waiting time distribution of
an email in a user's priority list, where $\tau_{\mathrm{real}}$ represents the time
between the time the user first sees an email and the time she sends a
reply to it. The black symbol shown in the upper left corner
represents the messages that were replied to right after the user noticed
them.

FIG. 4: Distribution of the response times for the letters replied to
by Einstein, Darwin and Freud, as indicated on each plot. Note that
the distributions are well approximated with a power law tail with
exponent $\alpha = 3/2$. While for Darwin and Einstein the datasets
provide very good statistics (the power law regime spanning 4 orders
of magnitude), the plot corresponding to Freud's responses is not as
impressive, yet it is still well approximated by the power law
distribution. Note that while in most cases the identified reply is indeed
a response to a received letter, there are exceptions as well: many of
the very delayed replies represent the renewal of a long lost
relationship.



be aware of the fact that not all emails A sends to B are direct
responses to emails received from B, thus there are some false
positives in the data that could be filtered out only by reading
the text of each email (which is not possible in the available
datasets). We have measured the waiting
time distribution in the email dataset, finding that the
distribution of the response times indeed follows a power law with
exponent $\alpha = 1$ (Fig. 3a).

B. The $\alpha = 3/2$ universality class: The correspondence of
Einstein, Darwin and Freud

In the case of the correspondence patterns of Einstein and
Darwin we will focus on the response time of the authors,
partly because we will see later that this has the most
importance from the modeling perspective. As shown in Fig. 4 the
probability that a letter will be replied to in $\tau_w$ days is well
approximated by a power law (3) with $\alpha = 3/2$, the
scaling spanning four orders of magnitude, from days to years.
Note that this exponent is significantly different from $\alpha = 1$
observed in the earlier datasets, and we will show later that
modeling efforts indeed establish $\alpha = 3/2$ as a scaling
exponent characterizing human dynamics.

The dataset allows us to determine the interevent times as
well, representing the time interval between two consecutive
letters sent by Einstein, Darwin or Freud to any recipient. We
find that the interevent time distribution is also heavy tailed,
although the quality of scaling is not as impressive as we observe
for the response time distribution. This is due to the fact that
we do not know the precise time when the letter was written (in
contrast with the email, which is known with second
resolution), but only the day on which it was mailed. Given that both
Einstein and Darwin wrote at least one letter most days, this
means that long interevent times are rarely observed.
Furthermore, owing to the long observational period (over 70 years),
the overall activity pattern of the two scientists changed
significantly, going from a few letters per year to as many as
400-800 letters/year during the later, more famous phase of their
professional life. Thus the interevent time distribution, while it
appears to follow a power law, is by no means stationary.
The response time distribution, however, is closer to stationary.



C. The stock broker activity pattern 

For the stock broker we again focus on the interevent
time distribution, finding that the best fit follows
$P(\tau) \sim \tau^{-\alpha} \exp(-\tau/\tau_0)$ with $\alpha = 1.3$ and $\tau_0 = 76$ min (see
Fig. 2d). This value is between $\alpha = 1$ observed for the users in
the first three datasets and $\alpha = 3/2$ observed for the
correspondence patterns. Yet, given the scattering of the
measured exponents, it is difficult to determine if this represents
a standard statistical deviation from $\alpha = 1$ or $\alpha = 3/2$, the
two values expected by the modeling efforts (see Sects. V and
VI), or whether it stands as evidence for a new universality class. At
this point we believe that the former case is valid, something
that can be decided only once data for more users becomes
available. The exponential cutoff is not inconsistent with the
modeling efforts either: as we will show in Appendix C, such
cutoffs are expected to accompany all human activity patterns
with $\alpha < 2$.



D. Qualitative differences between heavy tailed and Poisson 
activity patterns 

The heavy tailed nature of the observed interevent time dis- 
tribution has clear visual signatures. Indeed, it implies that an 
individual's activity pattern has a bursty character: short time 
intervals with intensive activity (bursts) are separated by long 
periods of no activity (Figs. 1d-f). Therefore, in contrast with
the relatively uniform activity pattern predicted by the Pois- 
son process, for a heavy tailed process very dense successions 
of events (bursts) are separated by very long gaps, predicted 
by the slowly decaying tail of the power law distribution. This 
bursty activity pattern agrees with our experience of an indi- 
vidual's normal email usage pattern: during a single session 
we typically send several emails in quick succession, followed 
by long periods of no email activity, when we focus on other 
activities. 
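
The contrast between the two kinds of activity pattern can be made concrete with a small numerical sketch. It is only an illustration under stated assumptions: the sample size, the seed, the helper names `sample_gaps` and `gap_spread`, and the choice of a Pareto law with density decaying as $\tau^{-3/2}$ are ours, not taken from the measured datasets.

```python
import random
import statistics


def gap_spread(gaps):
    """Ratio of the longest gap to the median gap (large values signal bursts)."""
    return max(gaps) / statistics.median(gaps)


def sample_gaps(n=10_000, seed=0):
    """Draw n interevent times from a Poisson process and from a heavy tailed law."""
    rng = random.Random(seed)
    # Poisson process: exponentially distributed interevent times (rate 1).
    poisson = [rng.expovariate(1.0) for _ in range(n)]
    # Heavy tail (assumed for illustration): Pareto variates with tail index 0.5,
    # i.e. a probability density decaying as tau ** -1.5 for tau >= 1.
    heavy = [rng.paretovariate(0.5) for _ in range(n)]
    return poisson, heavy


poisson_gaps, heavy_gaps = sample_gaps()
print("Poisson     max/median gap:", gap_spread(poisson_gaps))
print("Heavy tail  max/median gap:", gap_spread(heavy_gaps))
```

For the exponential case the longest gap is only a modest multiple of the typical one, while for the heavy tailed case a handful of very long gaps dominate the timeline, which is the bursty signature described above.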



IV. CAPTURING HUMAN DYNAMICS: QUEUING 
MODELS 

The empirical evidence discussed in the previous section 
raises several important questions: Why does the Poisson pro- 
cess fail to capture the temporal features of human activity? 
What is the origin of the observed heavy tailed activity pat- 
terns in human dynamics? To address these questions we need 
to inspect closely the processes that contribute to the timing of 
the events in which an individual participates. 

Most of the time humans simultaneously face several work,
entertainment, and family related responsibilities. Indeed, at
any moment an individual could choose to participate in one
of several tasks, ranging from shopping to sending emails,
making phone calls, attending meetings or talks, going to a
theater, getting tickets for a sports event, and so on. To keep
track of the various responsibilities ahead of them, individuals
maintain a to-do or priority list, recording the upcoming
tasks. While this list is occasionally written or electronically
recorded, in many cases it is simply kept in memory. A priority
list is a dynamic entity, since tasks are removed from it
after they are executed and new tasks are added continuously.
The tasks on the list compete with each other for the individual's
time and attention. Therefore, task management by
humans is best described as a queuing process [31, 32], where
the queue represents the tasks on the priority list, the server is
the individual who executes them and maintains the list, and
some selection protocol governs the order in which the tasks
are executed. To define the relevant queuing model we must
clarify some key features of the underlying queuing process,
ranging from the arrival and service processes to the nature of
the task selection protocol, and the restrictions on the queue
length [31]. In the following we discuss each of these ingredients
separately, placing special emphasis on their relevance
to human dynamics.

Server: The server refers to the individual (or agent) that 
maintains the queue and executes the tasks. In queuing the- 
ory we can have one or several servers in parallel (like check- 
out counters in a supermarket). Human dynamics is a single
server process, capturing the fact that an individual is solely
responsible for executing the tasks on his/her priority list.

Task Arrival Pattern: The arrival process specifies the
statistics of the arrival of new tasks to the queue. In queuing
theory it is often assumed that the arrival is a Poisson process,
meaning that new tasks arrive at a constant rate $\lambda$ to the queue,
randomly and independently from each other. We will use this
approximation for human queues as well, assuming that tasks
land at random times on the priority list. If the arrival process
is not captured by a Poisson process, it can be modeled
as a renewal process with a general distribution of interarrival
times [31]. For example, our measurements indicate that the
arrival time of emails follows a heavy tailed distribution, thus 
a detailed modeling of email based queues must take this into 
account. We must also keep in mind that the arrival rate of 
the tasks to the list is filtered by the individual, who decides 
which tasks to accept and place on the priority list and which 
to reject. In principle the rejection of a task is also a decision 
process that can be modeled as a high priority short lived task. 

Service process: The service process specifies the time it 
takes for a single task to be executed, such as the time nec- 
essary to write an email, explore a web page or read a book. 
In queuing theory the service process is often modeled as a
Poisson process, which means that the distribution of the time
devoted to the individual tasks has an exponential form.
However, in some applications the service time may follow
some general distribution. For example, the size distribution
of files transmitted by email is known to be fat tailed [33, 34],
suggesting that the time necessary to review (read) them could 
also follow a fat tailed distribution. In queuing theory it is of- 
ten assumed that the service time is independent of the task ar- 
rival process or the number of tasks on the priority list. While 
we adopt this assumption here as well, we must also keep in 
mind that the service time can decrease if too many tasks are 
in the queue, as humans may devote less time to individual 
tasks when they have many urgent things to do. 

Selection protocol or queue discipline: The selection pro- 
tocol specifies the manner in which the tasks in the queue are 
selected for execution. Most human initiated events require 
an individual to weigh and prioritize different activities. For
example, at the end of each activity an individual needs to de- 
cide what to do next: send an email, do some shopping or 
place a phone call, allocating time and resources for the cho- 
sen activity. Normally individuals assign to each task a prior- 
ity parameter, which allows them to compare the urgency of 
the different tasks on the list. The time a task waits before it 
is executed depends on the method the agent uses to choose 
the task to be executed next. In this respect three selection 
protocols are particularly relevant for human dynamics: 

(i) The simplest is the first-in-first-out (FIFO) protocol,
executing the tasks in the order they were added to the list. This
is common in service oriented processes, like the first-come-
first-served execution of orders in a restaurant or getting help
from directory assistance and consumer support.

(ii) The second possibility is to execute the tasks in a random
order, irrespective of their priority or time spent on the
list. This is common, for example, in educational settings,
when students are called on randomly, and in some packet
routing protocols.

(iii) In most human initiated activities task selection is not
random, but the individual tends always to execute the highest
priority item on his/her list. The resulting execution dynamics
is quite different from (i) and (ii): high priority tasks will be
executed soon after their addition to the list, while low priority
items will have to wait until all higher priority tasks are
cleared, forcing them to stay longer on the list. In the following
we show that this selection mechanism, practiced by
humans on a daily basis, is the likely source of the fat tails
observed in human initiated processes.
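
The three selection protocols can be written down in a few lines. The sketch below is a minimal illustration: the function name `select_task`, the tuple layout of a task and the protocol labels are hypothetical choices made here, not part of the models discussed in the paper.

```python
import random


def select_task(tasks, protocol, rng=random):
    """Return the index of the task to execute next.

    tasks    -- list of (label, priority) tuples, kept in arrival order
    protocol -- "fifo", "random" or "priority" (labels assumed for this sketch)
    """
    if not tasks:
        raise ValueError("the priority list is empty")
    if protocol == "fifo":
        # (i) first-in-first-out: the oldest task sits at the front of the list.
        return 0
    if protocol == "random":
        # (ii) random order, irrespective of priority or time spent on the list.
        return rng.randrange(len(tasks))
    if protocol == "priority":
        # (iii) always the highest priority item on the list.
        return max(range(len(tasks)), key=lambda i: tasks[i][1])
    raise ValueError(f"unknown protocol: {protocol}")


todo = [("email", 0.2), ("phone call", 0.9), ("shopping", 0.5)]
for name in ("fifo", "random", "priority"):
    print(name, "->", todo[select_task(todo, name)][0])
```

Only protocol (iii) looks at the priority values, which is why a low priority item can be overtaken repeatedly by newer, more urgent tasks.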

Queue Length or System Capacity: In most queuing mod- 
els the queue has an infinite capacity and the queue length can 
change dynamically, depending on the arrival and the execu- 
tion rate of the individual tasks. In some queuing processes 
there is a physical limitation on the queue length. For ex- 
ample, the buffers of Internet routers have finite capacity, so 
that packets arriving while the buffer is full are systematically 
dropped. In human activity one could argue that, given the 
possibility to maintain the priority list in a written or elec- 
tronic form, the length of the list has no limitations. Yet, if 
confronted with too many responsibilities, humans will start 
dropping some tasks and not accept others. Furthermore, 
while keeping track of a long priority list is not a problem 
for an electronic organizer, it is well established that the im- 
mediate memory of humans has finite capacity of about seven 
tasks [35, 36]. In other words, the number of priorities we can
easily remember, and therefore the length of our priority list,
is bounded. These considerations force us to inspect closely
the difference between finite and unbounded priority lists,
and the potential consequences of the queue length for the
waiting time distribution.

In this paper we follow the hypothesis that the empirically 
observed heavy tailed distributions originate in the queuing 
process of the tasks maintained by humans, and seek appropri- 
ate models to explain and quantify this phenomenon. Partic- 
ularly valuable are queuing models that do not contain power 
law distributions as inputs, and yet generate a heavy tailed out- 
put. In the following we will focus on priority queues, reflect- 
ing the fact that humans most likely choose the tasks based on 
their priority for execution. 

In the empirical datasets discussed in Sect. III we focused
on both the interevent time and the waiting time distribution
of the tasks in which humans participate. In the following two
sections we focus on the waiting time of a task on the priority
list rather than the interevent times. In this context the
waiting time, $\tau_w$, represents the time difference between the
arrival of a task to the priority list and its execution, thus it
is the sum of the time a task waits on the list and the time
devoted to executing it. In Sect. VII we will return to the
relationship between the empirically observed interevent times
and the waiting times predicted by the discussed models.



V. MODELS WITH VARIABLE QUEUE LENGTH: $\alpha = 3/2$
UNIVERSALITY CLASS



Our first goal is to explore the behavior of priority queues
in which there are no restrictions on the queue length. Therefore,
in these models an individual's priority list could contain
an arbitrary number of tasks. As we will show below, such
models offer a good approximation to the surface mail
correspondence patterns, such as that observed in the case of
Einstein, Darwin and Freud (see Sect. III B). Therefore, we will
construct the models with direct reference to the datasets
discussed in Sect. III. We assume that letters arrive at rate $\lambda$
following a Poisson process with exponential arrival time
distribution. Replacing letters with tasks, however, provides us a
more general model, in principle applicable to any human
activity. The responses are written at rate $\mu$, reflecting the
overall time a person devotes to his correspondence. Each letter
is assigned a discrete priority parameter $x = 1, 2, \ldots, r$ upon
arrival, such that the highest priority unanswered letter
(task) is always chosen for a reply. The lowest priority
task will have to wait the longest before execution, and therefore
it dominates the waiting time probability density for large
waiting times. This model was introduced in 1954 by Cobham
[37] to describe some manufacturing processes. Most of
the analytical work in queuing theory has concentrated on the
waiting time of the lowest priority task, finding that the waiting
time distribution follows [38]



$$P(\tau_w) \sim A\,\tau_w^{-3/2} \exp\left(-\frac{\tau_w}{\tau_0}\right), \qquad (4)$$



where $A$ and $\tau_0$ are functions of the model parameters, the
characteristic waiting time $\tau_0$ being given by

$$\tau_0 = \frac{1}{\mu \left(1 - \sqrt{\rho}\right)^2}, \qquad (5)$$

where $\rho = \lambda/\mu$ is the traffic intensity. Therefore, the waiting
time distribution is characterized by a power law decay with
exponent $\alpha = 3/2$, combined with an exponential cutoff.

The model can be extended to the case where the priorities
are not discrete, but take up continuous values $0 \le x < \infty$
from an arbitrary $\eta(x)$ distribution. The Laplace transform of
the waiting time distribution for this case has been calculated
in Ref. [31], but the resulting equation is difficult to invert,
forcing us to study the model numerically (Fig. 5). The natural
control parameter is $\rho = \lambda/\mu$, allowing us to distinguish
three qualitatively different regimes:

Subcritical regime, $\rho < 1$: Given that the arrival rate of
the tasks is smaller than the execution rate, the queue will be
often empty. This significantly limits the waiting time, most
tasks being executed soon after their arrival. The simulations
indicate that the waiting time distribution exhibits an asymptotic
scaling behavior consistent with (4) (Fig. 5). While in
the $\rho \to 0$ limit we observe mainly the exponential decay, as
$\rho$ approaches 1 a power law regime with exponent $\alpha = 3/2$
emerges, combined with the exponential cutoff.

FIG. 5: Waiting time distribution for tasks in the queueing model
discussed in Sect. V with continuous priorities. The numerical
simulations were performed as follows: At each step we generate an
arrival time $\tau_a$ and a service time $\tau_s$ from exponential
distributions with rates $\lambda$ and $\mu$, respectively. If $\tau_a < \tau_s$ or there
are no tasks in the queue then we add a new task to the queue, with
a priority $x \in [0, 1]$ drawn from a uniform distribution, and update
the time $t \to t + \tau_a$. Otherwise, we remove from the queue the task
with the largest priority and update the time $t \to t + \tau_s$. The
waiting time distribution is plotted for three $\rho = \lambda/\mu$ values:
$\rho = 0.9$ (circles), $\rho = 0.99$ (squares) and $\rho = 0.999$ (diamonds).
The data has been rescaled to emphasize the scaling behavior
$P(\tau_w) = \tau_w^{-3/2} f(\tau_w/\tau_0)$, where $\tau_0 \sim (1 - \sqrt{\rho})^{-2}$. In the inset we plot
the distribution of waiting times for $\rho = 1.1$, after collecting up
to 10** (plus) and 10^ (diamonds) executed tasks, showing that the
distribution of waiting times has a power law tail even for $\rho > 1$
(supercritical regime). Note, however, that in this regime a high
fraction of tasks are never executed, staying forever on the priority
list whose length increases linearly with time, a fact that is
manifested by a shift to the right of the cutoff of the waiting time
distribution.
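
The simulation procedure described in the caption of Fig. 5 can be sketched as follows. This is a minimal illustration rather than the code used for the figure: the function name `simulate_priority_queue`, the heap representation, the seed and the convention that a task's arrival time is the clock value right after the update $t \to t + \tau_a$ are assumptions made here.

```python
import heapq
import random


def simulate_priority_queue(lam, mu, steps, seed=0):
    """Priority queue with unbounded length and continuous priorities.

    Returns (waits, remaining, arrivals): the waiting times of executed tasks,
    the number of tasks still on the list, and the total number of arrivals.
    """
    rng = random.Random(seed)
    queue = []        # min-heap of (-priority, arrival_time): top = highest priority
    t = 0.0
    arrivals = 0
    waits = []
    for _ in range(steps):
        tau_a = rng.expovariate(lam)   # candidate time to the next arrival
        tau_s = rng.expovariate(mu)    # candidate service time
        if tau_a < tau_s or not queue:
            # A new task arrives with a uniform priority in [0, 1).
            t += tau_a
            heapq.heappush(queue, (-rng.random(), t))
            arrivals += 1
        else:
            # The highest priority task is executed and leaves the list.
            t += tau_s
            _, arrived = heapq.heappop(queue)
            waits.append(t - arrived)
    return waits, len(queue), arrivals


waits, remaining, arrivals = simulate_priority_queue(lam=0.99, mu=1.0, steps=200_000)
print("executed:", len(waits), "still waiting:", remaining)
print("longest waiting time:", max(waits))
```

A log-binned histogram of `waits` for several values of `lam / mu` is the natural next step for comparing with the scaling form quoted in the caption.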



Critical regime, $\rho = 1$: When the arrival and the response
rate of the letters are equal, according to (4) and (5) we should
observe a power law waiting time distribution with $\alpha = 3/2$
(Fig. 5). This regime would imply that, for example, Darwin
responds to all letters he receives, which is not the case, given
that the response rates are 0.32 (Darwin), 0.24 (Einstein) and
0.31 (Freud) [6]. In this case it is easy to show that the queue
length performs a one-dimensional random walk bounded at
$\ell = 0$. These fluctuations in the queue length will limit the
waiting time distribution, as the tasks will wait at most as long
as it takes for the queue length to return to $\ell = 0$. Therefore,
the waiting time distribution will have as upper bound the
return time distribution of a one-dimensional random walk. It is
known, however, that the return time distribution of a random
walker follows $P(t) \sim t^{-3/2}$ [39, 40], which is the origin of
the 3/2 exponent in Eq. (4). This argument indicates that the
scaling with $\alpha = 3/2$ is related to the fluctuations in the length
of the priority list.
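
The $t^{-3/2}$ decay of the return time distribution can be checked directly for the simplest case. The sketch below assumes a symmetric walk with unit steps on the integers (a simplification of the queue length dynamics, chosen here for illustration) and uses the standard closed form for the probability that such a walk first returns to the origin at step $2n$; the function name `first_return_prob` is ours.

```python
from math import comb, pi, sqrt


def first_return_prob(n):
    """Probability that a symmetric +/-1 random walk first returns to 0 at step 2n.

    Classical result: f_{2n} = C(2n, n) / ((2n - 1) * 4**n).
    """
    return comb(2 * n, n) / ((2 * n - 1) * 4 ** n)


# If f_{2n} decays as n ** -1.5, the rescaled value below approaches a constant,
# which for this walk is 1 / (2 * sqrt(pi)).
for n in (10, 100, 1000):
    print(n, first_return_prob(n) * n ** 1.5, 1 / (2 * sqrt(pi)))
```

The rescaled values settle quickly, so the 3/2 exponent is already visible for return times of a few tens of steps.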

Supercritical regime, $\rho > 1$: Given that in this regime the
arrival rate exceeds the response rate, the average queue length
grows linearly as $\langle \ell(t) \rangle = (\lambda - \mu) t$. Therefore, a $1 - 1/\rho$
fraction of the letters is never responded to, waiting indefinitely
in the queue. Given Darwin, Einstein and Freud's small
response rate, this regime captures best their correspondence
pattern. We can measure the waiting time for each letter that
is responded to. In Fig. 5 we show the waiting time probability
density obtained from numerical simulations, indicating
that it follows a power law with exponent $\alpha = 3/2$. Thus the
supercritical regime follows the same scaling behavior as the
critical regime, but only for the letters that are responded to.
The rest of the letters wait indefinitely in the list ($\tau_w = \infty$).

While the discussed model can indeed generate power law
waiting time distributions, a critical comparison with the
empirical datasets reveals some notable deficiencies. First, a
power law distribution emerges only in the critical ($\rho = 1$) and
the supercritical ($\rho > 1$) regimes. The critical regime requires
a careful tuning of the human execution rate, so that the
execution and the arrival rates are exactly the same. In contrast,
for $\rho > 1$ no tuning is necessary, but the number of tasks on
the list increases linearly with time, thus many tasks are never
executed. This limit is probably the most realistic for human
dynamics: we often take on tasks that we never execute, and
which technically stay on our priority list forever. As we discussed
above, this is documented to be the case for Einstein, Darwin and
Freud, who answered only a fraction of their letters. However,
we must not overlook the second important feature of the
discussed model: the only exponent it can predict is $\alpha = 3/2$,
rooted in the fluctuations of the queue length. While this fully
agrees with the correspondence patterns of Einstein, Darwin
and Freud, it is significantly higher than the values observed
in the empirical data discussed in Sect. III A on web browsing,
email communications or library visits, which we found
to be scattered around $\alpha = 1$.



VI. MODELS WITH FIXED QUEUE LENGTH: $\alpha = 1$
UNIVERSALITY CLASS

To understand the limitations of the model discussed in the
previous section we must remember that when the arrival and
execution rates are equal ($\rho = 1$) the length of the priority
list follows a random walk, and can thus occasionally take up
very large values. The situation is even worse for $\rho > 1$, when
the queue length increases linearly with time. Therefore,
according to the model an individual must have the capacity to
keep track of hundreds or thousands of tasks at the same time.
This may be appropriate for surface mail, where the letters
pile up on our desk until replied to. In contrast, there is extensive
evidence from the psychology literature that the number
of tasks humans can easily keep in their short term memory is
bounded [35], therefore it is unrealistic that we will remember
hundreds or thousands of tasks at any given time. This forces
us to inspect a model in which the length of the priority list
remains unchanged [8], a new task being added only when an
old task is removed from the list (executed).

We assume that an individual maintains a priority list with
$L$ tasks, each task being assigned a priority parameter $x_i$,
$i = 1, \ldots, L$, chosen from an $\eta(x)$ distribution. At each time step
with probability $p$ the individual selects the highest priority
task and executes it, removing it from the list. At that moment
a new task is added to the list, its priority $x_i$ is again chosen
from $\eta(x)$, thus the length $L$ of the list remains unchanged.
With probability $1 - p$ the individual executes a randomly
selected task, independent of its priority. The $p \to 1$ limit of the
model describes the deterministic highest-priority-first protocol,
when always the highest priority task is chosen for execution,
while $p \to 0$ corresponds to the random choice protocol,
introduced to mimic the fact that humans occasionally select
some low priority items for execution, before all higher priority
items are executed. In the model time is discrete, each task
execution corresponding to one unit of time. Implicit in this
assumption is the approximation that the service time distribution
follows a delta function, i.e., each task takes one unit of
time to execute.

To understand the dynamics of the model we first study it
via numerical simulations with priorities chosen from a uniform
distribution $x_i \in [0, 1]$. The simulations show that in
the $p \to 1$ limit the probability that a task spends $\tau_w$ time on
the list has a power law tail with exponent $\alpha = 1$ (Fig. 6a).
In the $p \to 0$ limit $P(\tau_w)$ follows an exponential distribution
(Fig. 6a), as expected for the random selection protocol. As
the typical length of the priority list differs from individual to
individual, it is important for the tail of $P(\tau_w)$ to be independent
of $L$. Numerical simulations indicate that this is indeed
the case: changes in $L$ do not affect the scaling of $P(\tau_w)$ [8].
The fact that the scaling holds for $L = 2$ as well indicates
that it is not necessary to have a long priority list: even if an
individual balances only two tasks at the same time, a bursty
heavy tailed interevent dynamics will emerge. Next we focus
on the $L = 2$ case, for which the model can be solved exactly,
providing important insights into its scaling behavior that can
be generalized for arbitrary $L$ values as well.



A. Exact solution for $L = 2$



For $L = 2$ the waiting time distribution can be determined
exactly [8] (see Appendix B), obtaining

$$P(\tau_w) = \begin{cases} 1 - \dfrac{1 - p^2}{4p} \ln \dfrac{1 + p}{1 - p}, & \tau_w = 1, \\[2ex] \dfrac{1 - p^2}{4p\,(\tau_w - 1)} \left[ \left( \dfrac{1 + p}{2} \right)^{\tau_w - 1} - \left( \dfrac{1 - p}{2} \right)^{\tau_w - 1} \right], & \tau_w > 1, \end{cases} \qquad (6)$$

independent of the $\eta(x)$ distribution from which the task priorities
are selected. In the limit $p \to 0$ from (6) follows that

$$\lim_{p \to 0} P(\tau_w) = \left( \frac{1}{2} \right)^{\tau_w}, \qquad (7)$$

i.e. $P(\tau_w)$ decays exponentially, in agreement with the numerical
results (Fig. 6a). This limit corresponds to the random
selection protocol, where a task is selected with probability
1/2 in each step. In the $p \to 1$ limit we obtain

$$\lim_{p \to 1} P(\tau_w) = \begin{cases} 1 + O\left( \dfrac{1 - p}{2} \ln(1 - p) \right), & \tau_w = 1, \\[2ex] O\left( \dfrac{1 - p}{2} \right) \dfrac{1}{\tau_w - 1}, & \tau_w > 1. \end{cases} \qquad (8)$$

In this case almost all tasks have a waiting time $\tau_w = 1$, being
executed as soon as they were added to the priority list.
The waiting time of tasks that are not selected in the first step
follows a power law distribution, decaying with $\alpha = 1$. This
behavior is illustrated in Fig. 6a by a direct plot of $P(\tau_w)$ in
(6) for a uniform distribution $\eta(x)$ in $0 \le x \le 1$. For $p < 1$
the $P(\tau_w)$ distribution has an exponential cutoff, which can be
derived from (6) after taking the $\tau_w \to \infty$ limit with $p$ fixed,
resulting in

$$P(\tau_w) \sim \frac{1 - p^2}{4p}\, \frac{1}{\tau_w - 1} \exp\left( -\frac{\tau_w - 1}{\tau_0} \right), \qquad (9)$$

where

$$\tau_0 = \left[ \ln \frac{2}{1 + p} \right]^{-1}. \qquad (10)$$

When $p \to 1$ we obtain that $\tau_0 \to \infty$ and, therefore, the
exponential cutoff is shifted to higher $\tau_w$ values, while the power
law behavior $P(\tau_w) \sim 1/\tau_w$ becomes more prominent. The
$P(\tau_w)$ curve systematically shifts, however, to lower values
for $\tau_w > 1$, indicating that the power law applies to a vanishing
task fraction (see Fig. 6a and (9)). In turn, $P(1) \to 1$
when $p \to 1$, corroborated by the direct plot of $P(1)$ as a
function of $p$ (see inset of Fig. 6a).

FIG. 6: (a) Waiting time probability distribution function for the
model discussed in Sect. VI for $L = 2$ and a uniform new task
priority distribution function, $\eta(x) = 1$, in $0 \le x \le 1$, as obtained
from (6) (lines) and numerical simulations (symbols), for $p = 0.9$
(squares), $p = 0.99$ (diamonds) and $p = 0.999$ (triangles). The
inset shows the fraction of tasks with waiting time $\tau_w = 1$, as obtained
from (6) (lines) and numerical simulations (symbols). (b) Average
waiting time of executed tasks vs the list size as obtained from (C3)
(lines) and numerical simulations (symbols), for $p = 0.0$ (squares),
$p = 0.999$ (circles) and $p = 1$ (diamonds).



B. Numerical results for $L > 2$

Based on the results discussed above, the overall behavior
of the model with a uniform priority distribution can be
summarized as follows. For $p = 1$, corresponding to the case
when always the highest priority task is removed, the model
does not have a stationary state. Indeed, each time the highest
priority task is executed, there is a task with smaller priority
$x_m$ left on the list. With probability $1 - x_m$ the newly added
task will have a priority larger than $x_m$, and will be
executed immediately. With probability $x_m$, however, the new
task will have a smaller priority, in which case the older task
will be executed, and the new task will become the 'resident'
one, with a smaller priority $x'_m < x_m$. For a long period
all new tasks will be executed right away, until another
task arrives, with probability $x'_m$, that again pushes the
non-executed priority to a smaller value $x''_m < x'_m$. Thus with time
the priority of the lowest priority task will converge to zero,
$x_m(t) \to 0$, and thus with a probability converging to one the
new task will be immediately executed. This convergence of
$x_m$ to zero implies that for $p = 1$ the model does not have a
stationary state. A stationary state develops, however, for any
$p < 1$, as in this case there is always a finite chance that the
lowest priority tasks will also be executed, thus the value of
$x_m$ will be reset, and will converge to some $x_m(p) > 0$. This
qualitative description applies for arbitrary $L > 2$ values.

To quantify this qualitative picture we studied numerically
the $L > 2$ case assuming that $\eta(x)$ is uniformly distributed in
the $0 \le x \le 1$ interval. To investigate how fast the system
approaches the stationary state we compute the average priority
of the lowest priority task in the queue, $\langle x_{\min}(t) \rangle$ (see
Figs. 7a, b), since it represents a lower bound for the average
of any other priorities on the list. We find that for any $L$ value
$\langle x_{\min}(t) \rangle$ decreases exponentially up to a time scale $\tau_0$,
when it reaches a stationary value $\langle x_{\min}(\infty) \rangle$. The numerical
simulations indicate that

[...] (11)

where

$$\langle x_{\min}(\infty) \rangle \sim (1 - p)\,[-\ln(1 - p)]^{\theta_L}. \qquad (12)$$

For $L = 2$ we can calculate $\langle x_{\min}(\infty) \rangle$ exactly [8], obtaining

$$\langle x_{\min}(\infty) \rangle = \frac{1 - p^2}{4p} \ln \frac{1 + p}{1 - p},$$

and therefore $\theta_2 = 1$. For $L > 2$ we determined $\theta_L$ from the
best data collapse, obtaining the values shown in the inset of
Fig. 7b, indicating that

$$\theta_L = \frac{\theta_3}{2^{L-3}},$$

where $\theta_3 = 0.22$ is the value of $\theta_L$ for $L = 3$. These results
support our qualitative discussion, indicating that for all $L > 2$
and $0 < p < 1$ values the system reaches a stationary state.

Finally we measured the waiting time distribution after the
system has reached the stationary state. The results for $L = 3$
are shown in Fig. 7c, and similar results were obtained for
other $L > 2$ values. The data collapse of the numerically
obtained $P(\tau)$ indicates that

$$P(\tau) \sim (1 - p)^2\, \frac{1}{\tau} \exp\left( -\frac{\tau}{\tau_0} \right)$$

when $L > 2$ and $\tau \gg 1$, where

[...]

in the $p \to 1$ limit. The simulations indicate that the model's
behavior for $L > 2$ is qualitatively similar to the behavior
derived exactly for $L = 2$, but different scaling parameters
characterize the scaling functions. For any $L > 2$, however,
the waiting times scale as $P(\tau_w) \sim \tau_w^{-1}$, i.e. we have $\alpha = 1$.

FIG. 7: Rescaled plot of the average priority of the lowest priority
task in the list for $L = 2$ (a) and $L = 3$ (b) and different values of $p$
(see legend). The inset in (b) shows the exponent $\theta_L$ for different $L$
(points), indicating that $\theta_L \approx \theta_3 / 2^{L-3}$ for $L > 2$ (continuous line).
(c) Rescaled plot of the waiting time distribution for $L = 3$. Similar
plots are obtained for larger values of $L$ (data not shown).
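
The fixed-length priority model of this section is simple enough to simulate in a few lines. The sketch below is a minimal illustration under stated assumptions: uniform priorities on $[0, 1)$, all initial tasks stamped with time 0, and the names `simulate_fixed_list` and `fraction_immediate` are ours rather than the authors'.

```python
import random


def simulate_fixed_list(L, p, steps, seed=0):
    """Priority list of fixed length L; returns the waiting times of executed tasks.

    At each unit time step the highest priority task is executed with
    probability p, otherwise a randomly chosen one; the executed task is
    replaced by a new task with a fresh uniform priority.
    """
    rng = random.Random(seed)
    tasks = [[rng.random(), 0] for _ in range(L)]   # [priority, time added]
    waits = []
    for t in range(1, steps + 1):
        if rng.random() < p:
            i = max(range(L), key=lambda k: tasks[k][0])   # highest priority first
        else:
            i = rng.randrange(L)                           # random selection
        waits.append(t - tasks[i][1])
        tasks[i] = [rng.random(), t]                       # list length stays L
    return waits


def fraction_immediate(waits):
    """Fraction of tasks executed in the step right after they were added."""
    return sum(1 for w in waits if w == 1) / len(waits)


for p in (0.0, 0.9, 0.99):
    w = simulate_fixed_list(L=2, p=p, steps=100_000)
    print(f"p = {p}: P(1) = {fraction_immediate(w):.3f}, longest wait = {max(w)}")
```

For $p = 0$ and $L = 2$ about half of the tasks are executed immediately, as expected for random selection between two items, while for $p$ close to one almost all tasks have $\tau_w = 1$ and the few remaining ones account for the long waiting times.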



C. Comparison with the empirical data 

As the results in the previous subsections show, the model
proposed to account for the $\alpha = 1$ universality class has some
apparent problems. Indeed, for truly deterministic execution
($p = 1$) the model does not have a stationary state. The
problem was cured by introducing random task execution
($p < 1$), which leads to stationarity. In this case, however, a
$p$ dependent fraction of tasks are executed immediately, and
only the remaining, long lived tasks follow a power law. As $p$
converges to one, the fraction of tasks executed immediately
approaches one, developing a significant gap between the power
law regime and the tasks displaying $\tau_w = 1$ waiting time. Is this
behavior realistic, or does it represent an artifact of the model? A
first comparison with the empirical data would suggest that
this is indeed an artifact, as the measurements shown in Figs. 2 and
3 do not provide evidence of a large number of tasks that are
immediately executed. However, when inspecting the measurement
results we should keep in mind that they represent
the interevent times, and not the waiting times. In the case
when the waiting time can be directly measured, like in the
email or mail based correspondence, there is some ambiguity
about the real waiting time. Indeed, in the email data, for
example, we have measured as waiting time the time difference
between the arrival of an email and the response sent to it.
While this offers an excellent approximation, from an individual's
or a priority queue's perspective this is not the real
waiting time. Indeed, consider the situation when an email
arrives at 9:00 am, and the recipient does not check her email
until 11:56 am, at which point she replies to the email immediately.
From the perspective of her priority list the waiting
time was less than a minute, as she replied as soon as she
saw the email. In our dataset, however, the waiting time will
be 2 hours and 56 minutes. Thus the way we measured the
waiting times cannot identify the true waiting time of a task
on a user's priority list. The email dataset allows us, however,
to get a much better approximation of the real waiting
times than we did before. Indeed, for an email $e_1$ received
by user A we record the time $t_1$ it arrives, and then the time
$t_2$ of the first email sent by user A to any other user after the
arrival of the selected email. It will be this time from which
we start measuring the waiting time for email $e_1$. Thus if user
A replies to $e_1$ at time $t_3$, we consider that the email's waiting
time is $\tau_{real} = t_3 - t_2$, instead of the $t_3 - t_1$ considered in Fig.
3a. The results, shown in Fig. 3b, display the same power
law scaling with $\alpha = 1$ as we have seen in Fig. 3a, but in
addition there is a prominent peak at $\tau_{real} = 1$, corresponding
to emails responded to immediately. Note that the peak's
magnitude is orders of magnitude larger than the probabilities
displayed by the large waiting times. This result suggests that
what we could have easily considered a model artifact in fact
captures a common feature of email communications. Indeed,
a high fraction of our emails is responded to immediately, right
after our first chance to read them, as predicted by the priority
model discussed in this section. Are there models that can
provide the $\alpha = 1$ universality class without the high fraction
of items executed immediately? While we have failed to come
up with any examples, we believe that developing such models
could be quite valuable.
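
The corrected measurement of the email waiting time can be sketched as follows. The function name `waiting_times`, the use of seconds since midnight as time stamps and the layout of the send log are assumptions made for this illustration; the actual dataset format is not specified here.

```python
from bisect import bisect_right


def waiting_times(t_arrival, t_reply, sent_times):
    """Return (naive, corrected) waiting times for one received email.

    t_arrival  -- time t1 at which the email arrives
    t_reply    -- time t3 at which the recipient replies to it
    sent_times -- sorted send times of all emails written by the recipient,
                  including the reply itself
    """
    naive = t_reply - t_arrival                      # t3 - t1
    # t2: first email the recipient sends to anyone after the arrival.
    k = bisect_right(sent_times, t_arrival)
    t2 = sent_times[k] if k < len(sent_times) else t_reply
    corrected = t_reply - min(t2, t_reply)           # t3 - t2
    return naive, corrected


nine_am = 9 * 3600
reply_at = 11 * 3600 + 56 * 60

# The recipient sends nothing between 9:00 and the reply at 11:56.
print(waiting_times(nine_am, reply_at, [reply_at]))
# The recipient was already active at 10:00 but answered only at 11:56.
print(waiting_times(nine_am, reply_at, [10 * 3600, reply_at]))
```

In the first case the corrected waiting time vanishes, since the reply is the recipient's first activity after the arrival; how such immediate replies are binned (for example at the smallest time unit) is a plotting convention that this sketch leaves open.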



VII. RELATIONSHIP BETWEEN WAITING TIMES AND 
INTEREVENT TIMES 

As we discussed above, the empirical measurements provide
either the interevent time distribution P(τ) (Sects. III A
and III C) or the waiting time distribution P(τ_w) (Sect. III B)
of the measured human activity patterns. In contrast, the model
predicts only the waiting time of a task on an individual's
priority list. What is the relationship between the observed
interevent times and the predicted waiting times? The basic
thesis of our paper is that the waiting times the various tasks
experience on an individual's priority list are responsible for
the heavy tailed distributions seen in the interevent times as
well. The purpose of this section is to discuss the relationship
between the two quantities.

The model prediction, that the waiting time distribution
of the tasks follows a power law, is directly supported by one
dataset in each universality class: the email data and the
correspondence data. As discussed in Sect. III, we have measured
the waiting time distribution for both datasets, finding that the
distribution of the response times indeed follows a power law
with exponent α = 1 (email) and α = 3/2 (correspondence
mail), as predicted by the models. Therefore, the directly
measured waiting times are likely rooted in the fat tailed
response time distribution. For the other three datasets, however,
namely web browsing, library visits and stock purchases,
we cannot determine the waiting time of the individual events,
as we do not know when a given task is added to the individual's
priority list.

To explore the broader relationship between the waiting
times and the interevent times we must remind ourselves that
while during the measurements we are focusing on a specific
task (like email), the models assume the knowledge of
all tasks that an individual is involved in. Thus the empirical
measurements offer only a selected subset of an individual's
activity pattern. To see the relationship between τ and τ_w, next
we discuss two different approaches.

Queueing of different task categories: The first approach
acknowledges the fact that tasks are grouped in different
categories of priorities: we often do not keep in mind specific
emails to be answered, but rather remember that we need to
check our email and answer whatever needs attention. Similarly,
we may remember a few things that we need to shop for,
but our priority list would often contain only one item: go to
the supermarket. When we monitor different human activity
patterns, we see the repetitive execution of these categories,
like going to the library, or doing emails, or browsing the web.
Given this, one possible modification of the discussed models
would assume that the tasks we monitor correspond to specific
activity categories, and when we are done with one of them,
we do not remove it from the list, but just add it back with
some changed priority. That is, checking our email does not
mean that we have deleted the email activity from our priority
list, but only that it now has a different priority. If we monitor
only one kind of activity, then a proper model would be the
following: we have L tasks, each assigned a given priority. After
a task is executed, it is reinserted in the queue with a
new priority chosen from the same distribution η(x). If we
now monitor the times at which the different tasks exit the list,
we will find that the interevent times for the monitored tasks
correspond exactly to the waiting time of that task on the list.
Note that this conceptual model would work even if the tasks
are not reinserted immediately, but after some delay τ_d. Indeed,
in this case the interevent time will be τ = τ_d + τ_w,
and as long as the distribution from which τ_d is selected
is bounded, the tail of the interevent time distribution will be
dominated by the waiting time statistics.
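
The reinsertion picture can be illustrated with a short simulation. The sketch below is an illustration under stated assumptions, not the code behind any result of the paper: it uses a fixed-length queue in which the highest priority task is executed with probability p and a random one otherwise, draws priorities uniformly, reinserts the executed category immediately (τ_d = 0), and monitors a single category (slot 0). The function name and parameter values are hypothetical.

```python
import random


def monitored_interevent_times(L=2, p=0.999, steps=200_000, seed=1):
    """Interevent times of one monitored category in a fixed-length queue."""
    rng = random.Random(seed)
    prio = [rng.random() for _ in range(L)]   # one priority per category
    last_exec = 0                             # last step at which slot 0 ran
    gaps = []
    for t in range(1, steps + 1):
        if rng.random() < p:
            # priority-based choice: execute the highest priority category
            k = max(range(L), key=prio.__getitem__)
        else:
            # priority-blind choice: execute a random category
            k = rng.randrange(L)
        # the category stays on the list and only receives a new priority
        prio[k] = rng.random()
        if k == 0:
            gaps.append(t - last_exec)
            last_exec = t
    return gaps
```

Because the monitored category re-enters the queue at the moment it is executed, each recorded gap is exactly the waiting time of that category on the list. In runs of this sketch most gaps equal one step, while a few are very long.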

Interaction between individuals: The timing of specific
emails also depends on the interaction between the individuals
that are involved in an email based communication. Indeed,
if user A gets an email from user B, she will put the email
into her priority list, and answer when she gets to it. Thus
the timing of the response depends on two parameters: the
receipt time of the email, and the waiting time on the priority
list. Consider two email users, A and B, that are involved
in an email based conversation. We assume that A sends an
email to B as a response to an email B sent to A, and vice versa.
Thus, the interevent time between two consecutive emails sent
by user A to user B is given by τ = τ_w^A + τ_w^B, where τ_w^A is
the waiting time the email experienced on user A's queue, and
τ_w^B is the waiting time of the response of user B to A's email.
If both users prioritize their tasks, then they both display the
same waiting time distribution, i.e. P(τ_w^A) ~ (τ_w^A)^{-α} and
P(τ_w^B) ~ (τ_w^B)^{-α}. In this case the interevent time
distribution P(τ), which is observed empirically if we study only the
activity pattern of user A, also follows P(τ) ~ τ^{-α}. Thus the
fact that users communicate with each other turns the waiting
time into an observable interevent time.

In summary, the discussed mechanisms indicate that the 
waiting time distribution of the tasks could in fact drive the 
interevent time distribution, and that the waiting time and the 
interevent time distributions should decay with the same scal- 
ing exponent. In reality, of course, the interplay between the 
two quantities can be more complex than discussed here, and
perhaps an even better mapping between the two measures could
be found for selected activities. But these two mechanisms
indicate that if the waiting time distribution is heavy tailed,
we would expect the interevent time distribution to be
affected by it as well.



VIII. DISCUSSION 

Universality classes: As summarized in the introduction, 
the main goal of the present paper was to discuss the potential 
origin of the heavy tailed distributed interevent times observed 
in human dynamics. To start we provided evidence that in five
distinct processes, each capturing a different human activity,
the interevent time distribution for individual users follows a
power law. Our fundamental hypothesis is that the observed
interevent time distributions are rooted in the mechanisms that
humans use to decide when to execute the tasks on their priority
list. To support this hypothesis we studied a family of
queuing models that assume that each task to be executed by
an individual waits some time on the individual's priority list,
and we showed that queuing can indeed generate power law
waiting time distributions. We find that a model that allows
the queue length to fluctuate leads to α = 3/2, while a model
for which the queue length is fixed displays α = 1. These
results indicate that human dynamics is described by at least
two universality classes, characterized by empirically
distinguishable exponents. Note that while we have classified the
models based on the limitations on the queue length, we cannot
exclude the existence of models with fixed queue length
that scale with α = 3/2, or models with fluctuating length
that display scaling with α = 1, or some other exponents.

In comparing these results with the empirical data, we find
that email and phone communication, web surfing and library
visitation belong to the α = 1 universality class. The
correspondence patterns of Einstein, Darwin and Freud offer
convincing evidence for the relevance of the α = 3/2 exponent,
and the related universality class, for human dynamics. In
contrast the fourth process, capturing a stock broker's activity,
shows α = 1.3. Given, however, that we have data only
for a single user, this value is in principle consistent with the
scattering of the exponents from user to user, thus we cannot
take it as evidence for a new universality class. One issue
still remains without a satisfactory answer: why do email
and surface mail (Einstein, Darwin and Freud datasets) belong
to different universality classes? We can comprehend
why the mail correspondence should belong to the 3/2 class:
letters likely pile up on the correspondent's desk until they are
answered, the desk serving as an external memory, thus we
are not required to remember them all. But the same argument
could be used to explain the scaling of email communications
as well, given that unanswered emails will stay in our mailbox
until we delete them (which is one kind of task execution).
Therefore one could argue that email based communication
should also belong to the 3/2 universality class, in contrast
with the empirical evidence, which clearly shows α = 1 [5, ?].

Some difficulty in comparing the empirical data with the
model predictions is rooted in the fact that the models predict
the waiting times, while for many real systems only the
interevent times can be measured. It is encouraging, however,
that for the email and the surface mail based communication
we were able to determine directly the waiting times as well,
and the exponents agreed with those determined from the
interevent times. In addition we argued that in a series of
processes the waiting time distribution determines the interevent
time distribution as well (see Sect. VII). This argument closes
the loop of the paper's logic, establishing the relevance of the
discussed queueing models to the datasets for which only
interevent times could be measured. We do not feel, however,
that this argument is complete, and future work will probably
strengthen this link. In this respect two directions are
particularly promising. First, designing queueing models that can
directly predict the observed interevent times as well would
be a major advance. Second, establishing a more general link
between the waiting times and the interevent times could also be
of significant value.

The results discussed in this paper leave a number of issues 
unresolved. In the following we will discuss some of these, 
outlining how answering them could further our understand- 
ing of the statistical mechanics of human driven processes. 

Tuning the universality class: As discussed above, the
models provide evidence for two distinct universality
classes in human dynamics, with distinguishable exponents.
The question is, are there other universality classes,
characterized by exponents different from 1 and 3/2? If other 
universality classes do exist, it would be valuable not only to 
find empirical support for them, but also to identify classes of 
models that are capable of predicting the new exponents. 

In searching for new exponents we need to explore several
different directions. First, inserting some power law process
into the queuing model could tune the obtained waiting time
distribution and the scaling exponents. There are different
ways to achieve this. One method, discussed in Appendix A, is
based on the hypothesis that while we always attempt to select
the highest priority task, circumstances or resource
availability may not allow us to achieve this. For example, our
highest priority may be to get cash from the bank, but we cannot
execute this task when the bank is closed, so we move on to some
lower priority task. One way to account for this is to use a
probabilistic selection protocol, assuming that the probability
to choose a task with priority x for execution in a unit time is
Π(x) ~ x^γ, where γ is a parameter that interpolates between
the random choice limit (ii) (γ = 0, p = 0) and the
deterministic case (iii), when always the highest priority item
is chosen for execution (γ = ∞, p = 1). As shown in Appendix A,
the exponent α will depend on γ as

\[
\alpha = 1 + \frac{1}{\gamma} . \qquad (16)
\]

At this moment we do not have evidence that such a preferential
selection process acts in human dynamics. However, detailed
datasets and proper measurement tools might help us decide
this by measuring the function Π(x) directly, capturing the
selection protocol. Such measurements were possible for complex
networks, where a similar function drives the preferential
attachment process [10, 41, 42, 43, 44, 45].
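
The dependence of α on γ in Eq. (16) can be checked numerically. The sketch below is an illustration under assumptions that go beyond the text: priorities are taken to be uniform on (0, 1) and Π(x) = x^γ exactly, so that the waiting time distribution is the integral of (1 - x^γ)^(τ-1) x^γ over x. The closed form through the Beta function (substitution u = x^γ) is a step added here for the example, and the function names are hypothetical.

```python
import math


def log_waiting_pdf(tau, gamma):
    """log P(tau) for uniform priorities and Pi(x) = x**gamma.

    Uses P(tau) = (1/gamma) * B(1/gamma + 1, tau), obtained from
    integral_0^1 (1 - x**gamma)**(tau - 1) * x**gamma dx with u = x**gamma.
    """
    a = 1.0 / gamma
    return (-math.log(gamma) + math.lgamma(a + 1.0)
            + math.lgamma(tau) - math.lgamma(tau + a + 1.0))


def waiting_pdf_quadrature(tau, gamma, n=20000):
    """Same integral by the midpoint rule, as an independent cross-check."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * h
        pi = x ** gamma                      # selection probability Pi(x)
        total += (1.0 - pi) ** (tau - 1) * pi
    return total * h


def local_exponent(tau, gamma):
    """Log-log slope of P between tau and 10 * tau, with the sign flipped."""
    return -(log_waiting_pdf(10 * tau, gamma)
             - log_waiting_pdf(tau, gamma)) / math.log(10.0)
```

For large τ the local slope returned by `local_exponent` approaches 1 + 1/γ, for example a value close to 3/2 for γ = 2 and close to 1 for large γ.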

As we discussed above, the main goal of this paper was
to demonstrate that the queuing of the tasks on an individual's
priority list can explain the heavy tailed distributions observed
in human activity patterns. To achieve this, we focused on
models with Poisson inputs, meaning that both the arrival time
and the execution time are bounded. In some situations, however,
the input distributions can themselves be heavy tailed.
This could have two origins: (i) Heavy tailed arrival time
distribution: As we show in Fig. 3, there is direct evidence for
this in the email communication datasets: we find that the
interevent time of arriving emails can be roughly approximated
with a power law with exponent 1. (ii) The execution
time could also be heavy tailed, describing the situation
when most tasks are executed very rapidly, while a few tasks
require a very long execution time. Evidence for this again
comes from the email system: the file sizes transmitted by
email are known to follow a heavy tailed distribution [33, 34].
Therefore, if we read every line of an email, in principle the
execution time should also be heavy tailed (i.e. the time we
actually take to work on the response, including reading the
original email). Note, however, that measurements failed to
establish a correlation between email size and the response
time [5]. It is not particularly surprising that both (i) and (ii)
would significantly impact the waiting time distribution,
generating a heavy tailed distribution for the waiting times even
when the behavior of the model otherwise would be exponential,
or changing the exponent α, thus altering the model's
universality class. Some aspects of this problem were addressed
recently by Blanchard and Hongler [46]. However, to illustrate
the impact of the heavy tailed inputs, in Appendix E we
study the model of Sect. V with a heavy tailed service time
distribution h{s) ^ s"'^ with < (3 < 1. 

Finally, could the power law distributed arrival and execu- 
tion times serve as the proper explanation for the observed 
heavy tailed interevent time distribution in human dynamics? 
Note that in a number of systems we observe heavy tailed dis- 
tributed events without evidence for power law inputs. For 
example, the timing of the library visits or stock purchases 
by brokers does not appear to be driven by any known power 
law inputs, and they have negligible execution time compared 
with the average observed interevent times. Similarly, the be- 
ginning of online games or instant messages is not driven by 
file sizes either, but only by the time availability for playing a
game or sending a message, which is mostly a priority driven 
issue. Therefore, while it is important to understand the im- 
pact of power law inputs on the scaling properties of various 
models, attempts to explain the waiting times solely based on 
the heavy tailed inputs only delegate the problem to an earlier 
cause (the origin of the power law input). 

Potential model extensions: Guided by the desire to construct
the simplest models that capture the essence of task
execution, we have neglected many processes that are obviously
present in human dynamics. For example, we assumed
that the priority of a task is assigned at the moment the
task is added to an individual's priority list, and remains
unchanged for the rest of the queuing process. In reality the
priorities themselves can change in time. For example, many
tasks have deadlines [46], and one could assume that a task's
priority diverges as the deadline approaches. Even in the
absence of a clear deadline some priorities may increase in time
[46], others may decrease. Sometimes external factors suddenly
change a task's priority: for example, the priority of watering
the lawn suddenly diminishes if it starts raining. The
possibility of dropping tasks, either by not allowing them on the
queue, or by simply deleting them from the queue, could also
affect the waiting time distributions. Tasks could be dropped
if they were not executed for a considerable time interval, and
thus become irrelevant, or when the individual is very busy,
or some may be simply forgotten. Obviously, the precise
impact on the waiting time distribution will depend on the
implementation of the task dropping conditions. It is important
to understand if any or all of these processes could change the
universality class of the waiting time distribution.

Model limitations: The studied datasets do not capture all 
tasks an individual is involved in, but only the timing of
selected activities, like sending emails or borrowing books from
the library. Yet, we must consider the fact that between any 
two recorded events individuals participate in many other non- 
recorded activities. For example, if we find that an individual 
clicks on a new document every few seconds, likely he/she is 
fully concentrating on web browsing. However, when we no- 
tice a break of hours or days between two consecutive clicks, 
it is clear that in the meantime the individual was involved in 
a myriad of other activities that were not visible to us. The 
queuing models discussed here were designed to take into 
consideration all human activities, as we assume that the prior- 
ity list of an individual contains all tasks the person is involved 
in. Currently an understanding of the interplay between the 
recorded and the invisible activities is still lacking. 

Task optimization: The order in which we execute differ- 
ent tasks is often driven by optimization: we try to minimize 
the total time, or some cost functions. This is particularly 
relevant if the execution times depend on the order in which 
the tasks are executed. For example, executing a certain
task might often be faster if we first execute some other
preparatory tasks, rather than the other way around. In principle
optimization could be incorporated into the studied models by
assuming that it determines the priority of the tasks.
Optimization raises several important questions for future work: How
should we model optimization driven queueing processes? 
Can they also lead to power laws, and if so, will they result 
in new universality classes? 

Correlations: So far we have focused on the origin of the
various distributions observed in human dynamics. Distributions
offer little information, however, about potential
correlations present in the observed time series. Such
correlations were documented in Ref. [23], observing that the
correlation function of the interevent time series for printing job
arrivals decayed as a power law. Are such temporal correlations
present in other systems as well? What is their origin?
Can the queuing models predict such correlations? Answers
to these questions could not only help better understand human
dynamics, but could also aid in distinguishing the various
models from each other.

Network effects: In searching for the explanation for the
observed heavy tailed human activity patterns we limited our
study to the properties of single queues. In reality none of our
actions are performed independently; most of our daily activity
is embedded in a web of actions of other individuals [47, 48].
Indeed, the timing of an email sent to user A may depend on
the time we receive an email from user B. An important future
goal is to understand how the various human activities
and their timing are affected by the fact that the individuals are
embedded in a network environment.

Non-human activity patterns: Heavy tailed interevent time
distributions do not occur only in human activity, but emerge
in many natural and technological systems. For example,
Omori's law on earthquakes [49, 50] records heavy tailed
interevent times between consecutive seismic activities;
measurements indicate that the fishing patterns of seabirds also
display heavy tailed statistics [51]; plasticity patterns [52] and
avalanches in lungs [53] show similar power law interevent
times. While a series of models have been proposed to capture
some of these processes individually, there is also a possibility
that some of these modeling frameworks can be reduced to
various queuing processes. Some of the studied queuing models
show a close relationship to several models designed to capture
self-organized criticality [54, 55, 56, 57, 58, 59]. Could
the mechanisms be similar at some fundamental level? Even if
such a higher degree of universality is absent, understanding the
mechanisms and queuing processes that drive human dynamics
could help us better understand other natural phenomena
as well, from the timing of chemical reactions in a cell to the
temporal records of major economic events [60] or the timing
of events in manufacturing processes and supply chains
[61, 62] or panic [63].



averaging over t weighted with f(x, t), providing

\[
\tau_w(x) = \sum_{t=1}^{\infty} t f(x,t) = \frac{1}{\Pi(x)} \sim \frac{1}{x^{\gamma}} , \qquad (A1)
\]

i.e. the higher an item's priority, the shorter is the average
time it waits before execution. To calculate w(τ_w) we use the
fact that the priorities are chosen from the η(x) distribution,
i.e. η(x)dx = w(τ_w)dτ_w, which gives

\[
w(\tau_w) \sim \frac{\eta\left(\tau_w^{-1/\gamma}\right)}{\tau_w^{1+1/\gamma}} , \qquad (A2)
\]

providing the relationship (16) between α and γ, and indicating
that with changing γ we can continuously tune α as
well. In the γ → ∞ limit, which converges to the strictly
priority based deterministic choice (p = 1) in the model, Eq.
(A2) predicts w(τ_w) ~ τ_w^{-1}, in agreement with the numerical
results (Fig 3a), as well as the empirical data on the email
interarrival times (Fig 2a). In the γ = 0 (p = 0) limit τ_w(x)
is independent of x, thus w(τ_w) converges to an exponential
distribution, as shown in Fig. 3b.

The apparent dependence of w(τ_w) on the η(x) distribution
from which the agent chooses the priorities may appear
to represent a potential problem, as assigning priorities is a
subjective process, each individual being characterized by their
own η(x) distribution. According to Eq. (A2), however, in
the γ → ∞ limit w(τ_w) is independent of η(x). Indeed, in the
deterministic limit the uniform η(x) can be transformed into
an arbitrary η'(x) with a parameter change, without altering
the order in which the tasks are executed [31]. This insensitivity
of the tail to η(x) explains why, despite the diversity of
human actions, encompassing both professional and personal
priorities, most decision driven processes develop a heavy tail.



APPENDIX A: PREFERENTIAL SELECTION PROTOCOL 



As we discussed in Sect. VIII, one possible modification
of the priority model introduced and studied in Sect. VI
involves the assumption that we do not always choose the highest
priority task for execution, but rather the tasks are chosen
stochastically, as an increasing function of their priority. That is,
the probability to choose a task with priority x for execution
in a unit time is Π(x) ~ x^γ, where γ is a predefined parameter
of the model. This parameter allows us to interpolate between
the random choice limit (ii) (γ = 0, p = 0) and the
deterministic case (iii), when always the highest priority item is
chosen for execution (γ = ∞, p = 1). Note that this
parameterization captures the scaling of the model discussed in
Sect. VI only in the p → 0 and p → 1 limits, but not for
intermediate p values. That is, the two limits of this model map
into the extreme limits of the model introduced in Sect. VI, but
the intermediate p and γ values do not map into each other.

The probability that a task with priority x waits a time
interval t before execution is f(x, t) = [1 - Π(x)]^{t-1} Π(x). The
average waiting time of a task with priority x is obtained by



APPENDIX B: EXACT SOLUTION OF THE PRIORITY
QUEUE MODEL WITH L = 2

Consider the model discussed in Sect. VI with L = 2.
The task that has just been selected and whose priority
has been reassigned will be called the new task, while the
other task will be called the old task. Let η(x) and
R(x) = ∫_0^x dx' η(x') be the priority probability density function
(pdf) and distribution function of the new task, which are given.
In turn, let η_1(x, t) and R_1(x, t) = ∫_0^x dx' η_1(x', t) be the
priority pdf and distribution function of the old task in the t-th
step. At the (t + 1)-th step there are two tasks on the list, their
priorities being distributed according to R(x) and R_1(x, t),
respectively. After selecting one task the old task will have the
distribution function

\[
R_1(x,t+1) = \int_0^x dx' \, \eta_1(x',t) \, q(x') + \int_0^x dx' \, \eta(x') \, q_1(x',t) , \qquad (B1)
\]

where

\[
q(x) = p \left[ 1 - R(x) \right] + (1-p) \frac{1}{2} \qquad (B2)
\]

is the probability that the new task is selected given the old
task has priority x, and

\[
q_1(x,t) = p \left[ 1 - R_1(x,t) \right] + (1-p) \frac{1}{2} \qquad (B3)
\]

is the probability that the old task is selected given the new
task has priority x. In the stationary state, R_1(x, t + 1) =
R_1(x, t), thus from (B1) we obtain

\[
R_1(x) = \frac{1+p}{2p} \left[ 1 - \frac{1}{1 + \frac{2p}{1-p} R(x)} \right] . \qquad (B4)
\]



Next we turn our attention to the waiting time distribution.
Consider a task with priority x that has just been added to
the queue. The selection of this task is independent from one
step to the other. Therefore, the probability that it waits τ_w
steps is given by the product of the probability that it is not
selected in the first τ_w - 1 steps and that it is selected in the
τ_w-th step. The probability that it is not selected in the first
step is q_1(x), while the probability that it is not selected in
each of the subsequent steps is q(x). Integrating over the new
task's possible priorities we obtain

\[
P(\tau_w) =
\begin{cases}
\int dR(x) \left[ 1 - q_1(x) \right] , & \tau_w = 1 \\
\int dR(x) \, q_1(x) \left[ 1 - q(x) \right] q(x)^{\tau_w - 2} , & \tau_w > 1 .
\end{cases}
\qquad (B5)
\]

Using (B2)-(B4) and integrating (B5) we finally obtain

\[
P(\tau_w) =
\begin{cases}
1 - \frac{1-p^2}{4p} \ln \frac{1+p}{1-p} , & \tau_w = 1 \\
\frac{1-p^2}{4p(\tau_w - 1)} \left[ \left( \frac{1+p}{2} \right)^{\tau_w - 1} - \left( \frac{1-p}{2} \right)^{\tau_w - 1} \right] , & \tau_w > 1 .
\end{cases}
\qquad (B6)
\]

Note that P(τ_w) is independent of the η(x) pdf from which
the tasks are selected. Indeed, what matters for task selection
is their relative order with respect to other tasks, with the result
that all dependences in (B2)-(B4) and (B5) appear via R(x).
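
The L = 2 model is simple enough to simulate directly and compare with a closed form. The sketch below is an illustration only. It assumes uniform priorities, counts a task executed in the first step after its insertion as τ_w = 1, and compares the simulated frequencies with the expression that follows from inserting (B2)-(B4) into (B5): the stationary relation gives q_1(x)[1 - q(x)] = (1 - p^2)/4, after which the integral over R is elementary. Function names are hypothetical.

```python
import math
import random


def exact_waiting_prob(tau, p):
    """Waiting time distribution for L = 2, obtained by integrating (B5)."""
    c = (1.0 - p * p) / (4.0 * p)
    if tau == 1:
        return 1.0 - c * math.log((1.0 + p) / (1.0 - p))
    n = tau - 1
    return c / n * (((1.0 + p) / 2.0) ** n - ((1.0 - p) / 2.0) ** n)


def simulate_waiting_probs(p, steps, seed=0):
    """Relative frequency of each waiting time in a direct simulation."""
    rng = random.Random(seed)
    prio = [rng.random(), rng.random()]
    age = [0, 0]                    # steps each task has already waited
    counts = {}
    for _ in range(steps):
        if rng.random() < p:
            k = 0 if prio[0] > prio[1] else 1   # highest priority first
        else:
            k = rng.randrange(2)                # random selection
        w = age[k] + 1                          # executed during this step
        counts[w] = counts.get(w, 0) + 1
        age[1 - k] += 1                         # the other task keeps waiting
        prio[k], age[k] = rng.random(), 0       # replace the executed task
    return {w: c / steps for w, c in counts.items()}
```

For p = 0.5 the closed form gives P(1) ≈ 0.588, P(2) = 0.1875 and P(3) = 0.09375, and the values sum to one over all τ_w.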



APPENDIX C: THE ASYMPTOTIC CHARACTERISTICS OF P(τ_w)

In Sect. VI we focused on a model with fixed queue length
L, demonstrating that it belongs to a new universality class
with α = 1. Next we derive a series of results that apply
to any queuing model that has a finite queue length, and is
characterized by an arbitrary task selection protocol [8]. In
each time step there are L tasks in the queue and one of them
is executed. Therefore

\[
\sum_{i=1}^{t} \tau_i + \sum_{i=1}^{L-1} \tau'_i = L t , \qquad (C1)
\]

where τ_i is the waiting time of the task executed at the i-th
step and τ'_i, i = 1, ..., L - 1, is the time interval that task
i, that is still active at the t-th step, has already spent on the
queue. The first term on the l.h.s. of (C1) corresponds to the
sum of the waiting times experienced by the t tasks that were
executed in the t steps since the beginning of the queue, while
the second term describes the sum of the waiting times of the
L - 1 tasks that are still on the queue after the t-th step. Given
that in each time step each of the L tasks experiences one time
step delay, the sum on the l.h.s. should equal Lt. From (C1)
it follows that

\[
\langle \tau_w \rangle = \frac{1}{t} \sum_{i=1}^{t} \tau_i = L - \frac{1}{t} \sum_{i=1}^{L-1} \tau'_i . \qquad (C2)
\]

If all active tasks have a chance to be executed sooner or later,
as is the case for the model studied in Sect. VI in the
0 ≤ p < 1 regime [?], we have ⟨τ'_i⟩ < ⟨τ_w⟩ and the last term
in (C2) vanishes when t → ∞. In contrast, for p = 1 the
numerical simulations [5] indicate that after some transient
time the most recently added task is always executed, while
L - 1 tasks remain indefinitely in the queue. In this case τ'_i ≈ t
in the t → ∞ limit and the last term in (C2) is of the order
of L - 1. Based on these arguments we conjecture that the
average waiting time of executed tasks is given by

\[
\langle \tau_w \rangle =
\begin{cases}
L , & 0 \le p < 1 \\
1 , & p = 1 ,
\end{cases}
\qquad (C3)
\]

which is corroborated by numerical simulations (see Fig. 6).

It is important to note that the equality in (C2) is independent
of the selection protocol, allowing us to reach conclusions
that apply beyond the model discussed in Sect. VI. From
(C2) we obtain

\[
\langle \tau_w \rangle \le L . \qquad (C4)
\]

From this constraint it follows that P(τ_w) must decay faster than
τ_w^{-2} when τ_w → ∞, otherwise ⟨τ_w⟩ would not be bounded.
Indeed, it is easy to see that for any α < 2 the average waiting
time ⟨τ_w⟩ diverges for a pure power law P(τ_w) ~ τ_w^{-α}. Thus,
when τ_w → ∞, we must either have

\[
P(\tau_w) \sim a \tau_w^{-\alpha} , \qquad \alpha > 2 , \qquad (C5)
\]

or

\[
P(\tau_w) \sim a \tau_w^{-\alpha} f\left( \frac{\tau_w}{\tau_0} \right) , \qquad (C6)
\]

where τ_0 > 0 and f(x) = O(b x^{α-2}) when x → ∞, where
b is a constant. That is, each time an α < 2 exponent is
observed (as it is for the empirical data discussed in Sect. III), an
exponential cutoff must accompany the scaling. For example,
for the model discussed above with L = 2 and 0 < p < 1 we
have α = 1 and f(x) decays exponentially [9], in line with
the constraint discussed above.
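
The bookkeeping behind (C1) and (C4) can be verified in a direct simulation, since the identity holds step by step for any selection rule. The sketch below is an illustration with assumed details: uniform priorities, the p-dependent selection rule of Sect. VI, and hypothetical function and parameter names. Ages are counted in whole steps, so the identity can be checked with exact integer arithmetic.

```python
import random


def waiting_time_budget(L=5, p=0.9, steps=10_000, seed=0):
    """Return (sum of executed waiting times, sum of ages still in the queue)."""
    rng = random.Random(seed)
    prio = [rng.random() for _ in range(L)]
    age = [0] * L
    executed_sum = 0
    for _ in range(steps):
        for i in range(L):
            age[i] += 1              # every queued task waits one more step
        if rng.random() < p:
            k = max(range(L), key=prio.__getitem__)   # highest priority
        else:
            k = rng.randrange(L)                      # random selection
        executed_sum += age[k]       # waiting time of the executed task
        prio[k] = rng.random()       # a new task takes its place
        age[k] = 0
    return executed_sum, sum(age)
```

After t steps the two returned numbers always add up to L t, so the average waiting time of the executed tasks, `executed_sum / steps`, can never exceed L.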



APPENDIX D: TRANSITIONS BETWEEN THE TWO 
UNIVERSALITY CLASSES 

A basic difference between the models discussed in Sect. V
and Sect. VI is the capacity of the queue. Our results indicate
that the model without limitation on the queue length displays
α = 3/2, rooted in the fluctuations of the queue length. In
contrast, the model with fixed queue length (Sect. VI) has
α = 1, rooted in the queuing of the low priority tasks on the
priority list. If indeed the limitation in the queue length plays
an important role, we should be able to develop a model that
can display a transition from the α = 3/2 to the α = 1
universality class as we limit the fluctuations in the queue length.
In this section we study such a model, interpolating between
the two observed scaling regimes. We start from the model
discussed in Sect. V and impose on it a maximum queue
length L. This can be achieved by altering the arrival rate of
the tasks: when there are L tasks in the queue no new tasks
will be accepted until at least one of the tasks is executed.
Mathematically this implies that the arrival rate depends on
the queue length ℓ as



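\[
\lambda_\ell =
\begin{cases}
\lambda , & 0 \le \ell < L \\
0 , & \ell \ge L .
\end{cases}
\qquad (D1)
\]

In the stationary state the queue length distribution P(ℓ)
satisfies the balance equation

\[
\lambda_{\ell-1} P(\ell-1) + \mu_{\ell+1} P(\ell+1) = (\lambda_\ell + \mu_\ell) P(\ell) , \qquad (D2)
\]

where

\[
\mu_\ell =
\begin{cases}
0 , & \ell = 0 \\
\mu , & 0 < \ell \le L .
\end{cases}
\qquad (D3)
\]

From (D2) we obtain, with ρ = λ/μ, the queue length distribution as

\[
P(\ell) = \frac{(1-\rho) \rho^{\ell}}{1 - \rho^{L+1}} , \qquad (D4)
\]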



suggesting the existence of three scaling regions. 
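
The three regimes can be read off numerically from (D4). The sketch below is an illustration with hypothetical function names: it evaluates the truncated geometric distribution for ρ = λ/μ, and checks the balance equation (D2) with the rates (D1) and (D3), taking μ = 1 as the unit of rate.

```python
def queue_length_distribution(rho, L):
    """Stationary P(l), l = 0..L, for a queue with at most L tasks."""
    if rho == 1.0:
        return [1.0 / (L + 1)] * (L + 1)      # limit of (D4) for rho -> 1
    norm = (1.0 - rho) / (1.0 - rho ** (L + 1))
    return [norm * rho ** l for l in range(L + 1)]


def balance_residual(rho, L):
    """Largest violation of the balance equation (D2), with mu = 1."""
    P = queue_length_distribution(rho, L)

    def lam(l):                                # arrival rate (D1)
        return rho if 0 <= l < L else 0.0

    def mu(l):                                 # execution rate (D3)
        return 1.0 if 0 < l <= L else 0.0

    def prob(l):                               # P(l) is zero outside 0..L
        return P[l] if 0 <= l <= L else 0.0

    worst = 0.0
    for l in range(L + 1):
        lhs = lam(l - 1) * prob(l - 1) + mu(l + 1) * prob(l + 1)
        rhs = (lam(l) + mu(l)) * prob(l)
        worst = max(worst, abs(lhs - rhs))
    return worst
```

For ρ = 0.01 almost all of the weight sits at ℓ = 0, for ρ = 1 the distribution is uniform over 0, ..., L, and for ρ = 100 almost all of the weight sits at ℓ = L.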

FIG. 8: Waiting time distribution for tasks in the queueing model
discussed in Appendix D with a maximum queue length L. The waiting
time distribution is plotted for three L values: L = 10 (circles),
L = 100 (squares) and L = 1000 (diamonds). The data has been rescaled
to emphasize the scaling behavior P(τ_w) = τ_w^{-3/2} f(τ_w/τ_0), where
τ_0 ~ L^2. In the inset we plot the waiting time distribution for
ρ = 10^6, showing the crossover to the model discussed in Sect. VI in
the limit ρ → ∞ and L fixed.

Subcritical regime, ρ ≪ 1: If the arrival rate of the tasks is
much smaller than the execution rate, the fact that the queue
length has an upper bound has little significance, since ℓ will
rarely reach its upper bound L, but will fluctuate in the vicinity
of ℓ = 0. This regime can be reached either for ρ ≪ 1 and
L fixed or for ρ < 1 and L ≫ 1. Therefore, in this case the
waiting time distribution is well approximated by that of the
model with an unlimited queue length, displaying the scaling
predicted by Eq. 0, i.e. either exponential, or a power law 
with a = 3/2, coupled with an exponential cutoff (see Fig. 
it). 

Critical regime: For ρ = 1 we observe an interesting interplay
between the queue length and L. Normally in this critical
regime ℓ(t) should follow a random walk with the return time
probability density scaling with exponent 3/2. However, the
limitation imposed on the queue length limits the power law 
waiting time distribution predicted by Eq. @, introducing a 
cutoff (see Fig. |SJ)). Indeed having the number of tasks in the 
queue limited allows each task to be executed in a finite time. 

Supercritical regime: When ρ ≫ 1, from (D4) it follows that

\[
P(\ell) =
\begin{cases}
\mathcal{O}(\rho^{-1}) , & 0 \le \ell < L \\
1 - \mathcal{O}(\rho^{-1}) , & \ell = L ,
\end{cases}
\qquad (D5)
\]

i.e. with probability almost one the queue is filled. Thus, in
the supercritical regime ρ ≫ 1 new tasks are added to the
queue immediately after a task is executed. If we take the
number of executed tasks as a new reference time then this
model corresponds to the one discussed in Sect. VI, displaying
α = 1 [5], as supported by the numerical simulations (see
Fig. 8).



APPENDIX E: HEAVY TAILED INPUT DISTRIBUTIONS 

In this Appendix we study the model discussed in Sec. |V] 
with a heavy tailed service time distribution h{s) ~ with 
0 < β < 1. In this case it has been shown that jssil



P{rw) - T-l" . (El) 

This result is a consequence of the generalized limit theorem 
for heavy tailed distributions [JJ. Let us focus on a selected 
task and assume that m tasks need to be executed before it. 
Therefore, the selected task's waiting time is given by

\[
\tau_w = \sum_{i=1}^{m} s_i , \qquad (E2)
\]

where s_i is the service time of the i-th task executed before
the given task. Equation (E2) represents the sum of m
independent and identically distributed random variables, each with
the heavy tailed pdf h(s), which is known to follow a pdf with the
same heavy tail, thus resulting in (E1). Hence, in this case the
heavy tail in the waiting time distribution is a consequence of
the heavy tails in the service time distribution.